Wikitech labswiki https://wikitech.wikimedia.org/wiki/Main_Page MediaWiki 1.47.0-wmf.3 first-letter Media Special Talk User User talk Wikitech Wikitech talk File File talk MediaWiki MediaWiki talk Template Template talk Help Help talk Category Category talk Obsolete Obsolete talk OfficeIT OfficeIT talk Tool Tool talk Nova Resource Nova Resource Talk Heira Heira Talk TimedText TimedText talk Module Module talk User:JackPotte 2 2862 2418620 2177972 2026-05-23T18:02:09Z JackPotte 923 /* Deployment */ 2418620 wikitext text/x-wiki {{#babel:fr|en-4|de-2|es-2|pt-1|it-1|ru-1|ar-1|hi-1|zh-1|ja-1|la-1|el-1|vi-0|id-0|ko-0|nl-0}} User UTC + 1 (Paris, France) == Own tools == * https://jackbot.toolforge.org/ * https://jackbot.toolforge.org/xtools/public_html/unicode-HTML.php == Hosted tools == * https://jackbot.toolforge.org/snottywong/ == Other tools == * https://toolforge.org/ ** https://xtools.toolforge.org/ ** https://anagrimes.toolforge.org/ ** https://wikt-mwtest.toolforge.org/core/ == Shell == === Connect to the wikis replicas === <syntaxhighlight lang=bash> $ mysql --defaults-file=replica.my.cnf -h enwiki.labsdb MariaDB [(none)]> connect enwiki_p ... $ mysql --defaults-file=~/replica.my.cnf -h enwiktionary.labsdb enwiktionary_p MariaDB [(none)]> connect enwiktionary_p ... $ mysql --defaults-file=replica.my.cnf -h frwiktionary.labsdb connect frwiktionary_p </syntaxhighlight> === Create one's own database === <syntaxhighlight lang=bash> mysql --defaults-file=replica.cnf -h tools-db ... create database p48358730291690573246813765835736425432__mycooldb; ... </syntaxhighlight> == MySQL == Executable on https://quarry.wmflabs.org/. See also https://meta.wikimedia.org/wiki/Research:GCI_Wiki_Study/2018. 
=== Page content by title === <syntaxhighlight lang=mysql> SELECT CAST(pp_page AS CHAR(1000000) CHARACTER SET utf8) FROM page JOIN page_props ON page_id = pp_page WHERE page_namespace = 0 and page_title = 'jackpot'; </syntaxhighlight> === Page name by content === <syntaxhighlight lang=mysql> USE frwiki_p; SELECT p.page_title FROM page p JOIN page_props pp ON p.page_id = pp.pp_page WHERE p.page_namespace = 0 AND pp.pp_page REGEXP '\n *titre *=' </syntaxhighlight> === Pages modified by === <syntaxhighlight lang=mysql> SELECT page_title FROM pp_value JOIN page ON page_id = pp_page WHERE pp_value = 'JackPotte'; </syntaxhighlight> === Edit count === <syntaxhighlight lang=mysql> select user_editcount from user where user_name='JackPotte'; </syntaxhighlight> === Created pages === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0; </syntaxhighlight> === Hidden editions === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0 and rev_deleted=1; </syntaxhighlight> === Count the number of editions on a day === <syntaxhighlight lang=mysql> SELECT COUNT(*) FROM revision where rev_timestamp like '20150531%'; </syntaxhighlight> == Crons == <syntaxhighlight lang=bash> toolforge-jobs list -l </syntaxhighlight> == Deployment == On <code>tools.jackbot@tools-login:~$</code>: <syntaxhighlight lang=bash> cd JackBot git stash git pull </syntaxhighlight> == Git == <syntaxhighlight lang=bash> apt-get remove git-review pip install git-review git review -s git branch git remote -v ssh jackpotte@gerrit.wikimedia.org:29418/test/mediawiki/extensions/examples.git git review -s git config -l git config --global user.name "jackpotte" git clone https://gerrit.wikimedia.org/r/p/test/mediawiki/extensions/examples.git git review -sgit pull origin master git 
pull origin master git checkout -b jackbot-1 master git diff git status git add test1.php git status git diff --cached git commit git pull origin master git rebase master git review -R cd .git git fetch https://gerrit.wikimedia.org/r/mediawiki/core refs/changes/69/17069/1 && git checkout FETCH_HEAD </syntaxhighlight> == Git / Gerrit == === Quiz === <syntaxhighlight lang="bash"> git clone ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz cd Quiz vim Quiz.class.php git add Quiz.class.php git commit git fetch git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master # Error with a change ID git commit --amend # Insertion of the change ID at the last line git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master </syntaxhighlight> === Pywikibot === <syntaxhighlight lang="bash"> git clone https://gerrit.wikimedia.org/r/pywikibot/core.git cd core gitdir=$(git rev-parse --git-dir); scp -p -P 29418 jackpotte@gerrit.wikimedia.org:hooks/commit-msg ${gitdir}/hooks/ pip install -r requirements.txt cd scripts git clone https://gerrit.wikimedia.org/r/pywikibot/i18n.git # Modification here git add -A git commit -m "Add a French translation for clean_sandbox" git push origin HEAD:refs/for/master </syntaxhighlight> mkevj1x2qg0c71t6u28dwj5tgode0st 2418621 2418620 2026-05-23T18:03:29Z JackPotte 923 /* Git */ 2418621 wikitext text/x-wiki {{#babel:fr|en-4|de-2|es-2|pt-1|it-1|ru-1|ar-1|hi-1|zh-1|ja-1|la-1|el-1|vi-0|id-0|ko-0|nl-0}} User UTC + 1 (Paris, France) == Own tools == * https://jackbot.toolforge.org/ * https://jackbot.toolforge.org/xtools/public_html/unicode-HTML.php == Hosted tools == * https://jackbot.toolforge.org/snottywong/ == Other tools == * https://toolforge.org/ ** https://xtools.toolforge.org/ ** https://anagrimes.toolforge.org/ ** https://wikt-mwtest.toolforge.org/core/ == Shell == === Connect to the wikis replicas === <syntaxhighlight lang=bash> $ mysql 
--defaults-file=replica.my.cnf -h enwiki.labsdb MariaDB [(none)]> connect enwiki_p ... $ mysql --defaults-file=~/replica.my.cnf -h enwiktionary.labsdb enwiktionary_p MariaDB [(none)]> connect enwiktionary_p ... $ mysql --defaults-file=replica.my.cnf -h frwiktionary.labsdb connect frwiktionary_p </syntaxhighlight> === Create one's own database === <syntaxhighlight lang=bash> mysql --defaults-file=replica.cnf -h tools-db ... create database p48358730291690573246813765835736425432__mycooldb; ... </syntaxhighlight> == MySQL == Executable on https://quarry.wmflabs.org/. See also https://meta.wikimedia.org/wiki/Research:GCI_Wiki_Study/2018. === Page content by title === <syntaxhighlight lang=mysql> SELECT CAST(pp_page AS CHAR(1000000) CHARACTER SET utf8) FROM page JOIN page_props ON page_id = pp_page WHERE page_namespace = 0 and page_title = 'jackpot'; </syntaxhighlight> === Page name by content === <syntaxhighlight lang=mysql> USE frwiki_p; SELECT p.page_title FROM page p JOIN page_props pp ON p.page_id = pp.pp_page WHERE p.page_namespace = 0 AND pp.pp_page REGEXP '\n *titre *=' </syntaxhighlight> === Pages modified by === <syntaxhighlight lang=mysql> SELECT page_title FROM pp_value JOIN page ON page_id = pp_page WHERE pp_value = 'JackPotte'; </syntaxhighlight> === Edit count === <syntaxhighlight lang=mysql> select user_editcount from user where user_name='JackPotte'; </syntaxhighlight> === Created pages === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0; </syntaxhighlight> === Hidden editions === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0 and rev_deleted=1; </syntaxhighlight> === Count the number of editions on a day === <syntaxhighlight lang=mysql> SELECT COUNT(*) FROM revision where rev_timestamp like 
'20150531%'; </syntaxhighlight> == Crons == <syntaxhighlight lang=bash> toolforge-jobs list -l </syntaxhighlight> == Deployment == On <code>tools.jackbot@tools-login:~$</code>: <syntaxhighlight lang=bash> cd JackBot git stash git pull </syntaxhighlight> == Git / Gerrit == === Quiz === <syntaxhighlight lang="bash"> git clone ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz cd Quiz vim Quiz.class.php git add Quiz.class.php git commit git fetch git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master # Error with a change ID git commit --amend # Insertion of the change ID at the last line git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master </syntaxhighlight> === Pywikibot === <syntaxhighlight lang="bash"> git clone https://gerrit.wikimedia.org/r/pywikibot/core.git cd core gitdir=$(git rev-parse --git-dir); scp -p -P 29418 jackpotte@gerrit.wikimedia.org:hooks/commit-msg ${gitdir}/hooks/ pip install -r requirements.txt cd scripts git clone https://gerrit.wikimedia.org/r/pywikibot/i18n.git # Modification here git add -A git commit -m "Add a French translation for clean_sandbox" git push origin HEAD:refs/for/master </syntaxhighlight> 1hyunn4sq512qhqjd6i0ufmow3ctwar 2418622 2418621 2026-05-23T19:02:11Z JackPotte 923 2418622 wikitext text/x-wiki {{#babel:fr|en-4|de-2|es-2|pt-1|it-1|ru-1|ar-1|hi-1|zh-1|ja-1|la-1|el-1|vi-0|id-0|ko-0|nl-0}} User UTC + 1 (Paris, France) == Own tools == * https://jackbot.toolforge.org/ * https://jackbot.toolforge.org/xtools/public_html/unicode-HTML.php == Hosted tools == * https://jackbot.toolforge.org/snottywong/ == Other tools == * https://toolforge.org/ ** https://xtools.toolforge.org/ ** https://anagrimes.toolforge.org/ ** https://wikt-mwtest.toolforge.org/core/ == Shell == === Connect to the wikis replicas === <syntaxhighlight lang=bash> $ mysql --defaults-file=replica.my.cnf -h enwiki.labsdb MariaDB [(none)]> connect enwiki_p ... 
$ mysql --defaults-file=~/replica.my.cnf -h enwiktionary.labsdb enwiktionary_p MariaDB [(none)]> connect enwiktionary_p ... $ mysql --defaults-file=replica.my.cnf -h frwiktionary.labsdb connect frwiktionary_p </syntaxhighlight> === Create one's own database === <syntaxhighlight lang=bash> mysql --defaults-file=replica.cnf -h tools-db ... create database p48358730291690573246813765835736425432__mycooldb; ... </syntaxhighlight> === Crons === <syntaxhighlight lang=bash> toolforge-jobs list -l </syntaxhighlight> === Git / Gerrit === ==== Quiz ==== <syntaxhighlight lang="bash"> git clone ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz cd Quiz vim Quiz.class.php git add Quiz.class.php git commit git fetch git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master # Error with a change ID git commit --amend # Insertion of the change ID at the last line git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master </syntaxhighlight> ==== Pywikibot ==== <syntaxhighlight lang="bash"> git clone https://gerrit.wikimedia.org/r/pywikibot/core.git cd core gitdir=$(git rev-parse --git-dir); scp -p -P 29418 jackpotte@gerrit.wikimedia.org:hooks/commit-msg ${gitdir}/hooks/ pip install -r requirements.txt cd scripts git clone https://gerrit.wikimedia.org/r/pywikibot/i18n.git # Modification here git add -A git commit -m "Add a French translation for clean_sandbox" git push origin HEAD:refs/for/master </syntaxhighlight> == MySQL == Executable on https://quarry.wmflabs.org/. See also https://meta.wikimedia.org/wiki/Research:GCI_Wiki_Study/2018. 
=== Page content by title === <syntaxhighlight lang=mysql> SELECT CAST(pp_page AS CHAR(1000000) CHARACTER SET utf8) FROM page JOIN page_props ON page_id = pp_page WHERE page_namespace = 0 and page_title = 'jackpot'; </syntaxhighlight> === Page name by content === <syntaxhighlight lang=mysql> USE frwiki_p; SELECT p.page_title FROM page p JOIN page_props pp ON p.page_id = pp.pp_page WHERE p.page_namespace = 0 AND pp.pp_page REGEXP '\n *titre *=' </syntaxhighlight> === Pages modified by === <syntaxhighlight lang=mysql> SELECT page_title FROM pp_value JOIN page ON page_id = pp_page WHERE pp_value = 'JackPotte'; </syntaxhighlight> === Edit count === <syntaxhighlight lang=mysql> select user_editcount from user where user_name='JackPotte'; </syntaxhighlight> === Created pages === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0; </syntaxhighlight> === Hidden editions === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page WHERE rev_user_text='JackPotte' and page_namespace=0 AND page_is_redirect=0 and rev_deleted=1; </syntaxhighlight> === Count the number of editions on a day === <syntaxhighlight lang=mysql> SELECT COUNT(*) FROM revision where rev_timestamp like '20150531%'; </syntaxhighlight> qcga0kcwmv0uion6cp4i17ykztn59hd 2418623 2418622 2026-05-23T22:20:38Z JackPotte 923 /* Crons */ 2418623 wikitext text/x-wiki {{#babel:fr|en-4|de-2|es-2|pt-1|it-1|ru-1|ar-1|hi-1|zh-1|ja-1|la-1|el-1|vi-0|id-0|ko-0|nl-0}} User UTC + 1 (Paris, France) == Own tools == * https://jackbot.toolforge.org/ * https://jackbot.toolforge.org/xtools/public_html/unicode-HTML.php == Hosted tools == * https://jackbot.toolforge.org/snottywong/ == Other tools == * https://toolforge.org/ ** https://xtools.toolforge.org/ ** https://anagrimes.toolforge.org/ ** https://wikt-mwtest.toolforge.org/core/ == Shell == === Connect to the wikis 
replicas === <syntaxhighlight lang=bash> $ mysql --defaults-file=replica.my.cnf -h enwiki.labsdb MariaDB [(none)]> connect enwiki_p ... $ mysql --defaults-file=~/replica.my.cnf -h enwiktionary.labsdb enwiktionary_p MariaDB [(none)]> connect enwiktionary_p ... $ mysql --defaults-file=replica.my.cnf -h frwiktionary.labsdb connect frwiktionary_p </syntaxhighlight> === Create one's own database === <syntaxhighlight lang=bash> mysql --defaults-file=replica.cnf -h tools-db ... create database p48358730291690573246813765835736425432__mycooldb; ... </syntaxhighlight> === Git / Gerrit === ==== Quiz ==== <syntaxhighlight lang="bash"> git clone ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz cd Quiz vim Quiz.class.php git add Quiz.class.php git commit git fetch git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master # Error with a change ID git commit --amend # Insertion of the change ID at the last line git push ssh://jackpotte@gerrit.wikimedia.org:29418/mediawiki/extensions/Quiz HEAD:refs/for/master </syntaxhighlight> ==== Pywikibot ==== <syntaxhighlight lang="bash"> git clone https://gerrit.wikimedia.org/r/pywikibot/core.git cd core gitdir=$(git rev-parse --git-dir); scp -p -P 29418 jackpotte@gerrit.wikimedia.org:hooks/commit-msg ${gitdir}/hooks/ pip install -r requirements.txt cd scripts git clone https://gerrit.wikimedia.org/r/pywikibot/i18n.git # Modification here git add -A git commit -m "Add a French translation for clean_sandbox" git push origin HEAD:refs/for/master </syntaxhighlight> == MySQL == Executable on https://quarry.wmflabs.org/. See also https://meta.wikimedia.org/wiki/Research:GCI_Wiki_Study/2018. 
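Besides Quarry, the queries in this section can be run from a Toolforge shell. The following is a minimal sketch, assuming the <code>&lt;wiki&gt;.labsdb</code> host and <code>&lt;wiki&gt;_p</code> database naming shown in the Shell section is the one in use; <code>replica_cmd</code> is a hypothetical helper that only prints the command and does not connect anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: replica_cmd is a hypothetical helper, not a Toolforge command.
# It prints (without running) the mysql invocation for a wiki code, assuming
# the <wiki>.labsdb host and <wiki>_p database naming used on this page.
replica_cmd() {
    local wiki="$1"
    printf 'mysql --defaults-file=%s -h %s %s\n' \
        "$HOME/replica.my.cnf" "${wiki}.labsdb" "${wiki}_p"
}

# Print the command for French Wiktionary.
replica_cmd frwiktionary
```

Appending <code>-e "SELECT ...;"</code> to the printed command would run a single query non-interactively instead of opening the MariaDB prompt.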
=== Page content by title === <syntaxhighlight lang=mysql> SELECT CAST(pp_value AS CHAR(1000000) CHARACTER SET utf8) FROM page JOIN page_props ON page_id = pp_page WHERE page_namespace = 0 AND page_title = 'jackpot'; </syntaxhighlight> === Page name by content === <syntaxhighlight lang=mysql> USE frwiki_p; SELECT p.page_title FROM page p JOIN page_props pp ON p.page_id = pp.pp_page WHERE p.page_namespace = 0 AND pp.pp_value REGEXP '\n *titre *='; </syntaxhighlight> === Pages modified by === <syntaxhighlight lang=mysql> SELECT page_title FROM page_props JOIN page ON page_id = pp_page WHERE pp_value = 'JackPotte'; </syntaxhighlight> === Edit count === <syntaxhighlight lang=mysql> SELECT user_editcount FROM user WHERE user_name='JackPotte'; </syntaxhighlight> === Created pages === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page JOIN actor ON rev_actor=actor_id WHERE actor_name='JackPotte' AND rev_parent_id=0 AND page_namespace=0 AND page_is_redirect=0; </syntaxhighlight> === Hidden edits === <syntaxhighlight lang=mysql> SELECT DISTINCT page_title FROM page JOIN revision ON page_id=rev_page JOIN actor ON rev_actor=actor_id WHERE actor_name='JackPotte' AND page_namespace=0 AND page_is_redirect=0 AND rev_deleted=1; </syntaxhighlight> === Count the number of edits on a day === <syntaxhighlight lang=mysql> SELECT COUNT(*) FROM revision WHERE rev_timestamp LIKE '20150531%'; </syntaxhighlight> izepc0bbhzlqfc2xay3pzqrf3ya9r1r Server Admin Log 0 7919 2418625 2418614 2026-05-24T02:00:34Z Stashbot 7414 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image 2418625 wikitext text/x-wiki == 2026-05-24 == * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image == 2026-05-23 == * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image == 2026-05-22 == * 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]] * 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]] * 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 17:34 topranks: enable ttl protection on esams CRs IBGP session 
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session * 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet * 
14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox * 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet * 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox * 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet * 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet * 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply * 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply * 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: END (FAIL) 
- Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts * 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]] * 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts * 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp * 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet * 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet * 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia * 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet * 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp * 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet * 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . * 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . 
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet * 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply * 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply * 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet * 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed * 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet * 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 * 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed * 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet * 11:01 
slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet * 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet * 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet * 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed * 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet * 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]] * 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie * 10:31 elukey@cumin1003: END (PASS) - Cookbook
sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie * 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet * 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed * 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet * 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage * 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie * 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet * 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage * 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet * 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage * 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie * 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage * 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 09:39 
cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet * 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet * 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie * 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet * 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet * 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie * 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie * 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet * 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp * 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet * 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host 
sretest2010.codfw.wmnet with OS trixie * 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet * 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp * 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors * 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003" * 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors * 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003" * 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished 
rebooting cp3069.esams.wmnet * 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet * 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet * 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A * 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp * 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet * 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A * 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057 * 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057 * 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet * 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet * 
06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet * 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet * 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org * 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org * 05:25 marostegui@dns1004: END - running authdns-update * 05:24 marostegui@dns1004: START - running authdns-update * 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]] * 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot * 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap 
sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s) * 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified * 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] * 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie * 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage * 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage * 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START 
helmfile.d/dse-k8s-services/test-kitchen: apply * 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie * 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase * 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:53 papaul: rebooting msw1-codfw * 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply * 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply * 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply * 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply * 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 17:52 
swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply * 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply * 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply * 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply * 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply * 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply * 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply * 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply * 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:41 
swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028 * 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down * 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply * 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply * 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet * 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:14 
jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet * 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply * 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy 
FORCE_RESTART * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029 * 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031 * 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029 * 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028 * 16:55 papaul: rebooting msw-d3-codfw * 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 16:52 papaul: rebooting msw-c7-codfw * 16:51 papaul: rebooting msw-c6-codfw * 16:48 papaul: rebooting msw-b7-codfw * 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet * 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet * 16:43 papaul: rebooting msw-b6-codfw * 16:40 papaul: rebooting msw-a1-codfw 
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet * 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029 * 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002" * 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002" * 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox * 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables * 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 
pc2022.codfw.wmnet with reason: Move to nftables * 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling * 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json * 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:57 
javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling * 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json * 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json * 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance * 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json * 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json * 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet * 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet * 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]] * 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet * 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json * 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet * 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti 
node ganeti1037.eqiad.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet * 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master * 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet * 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors * 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors * 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s) * 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json * 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet * 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet * 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors * 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. 
on all recursors * 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master * 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet * 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki * 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]] * 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet * 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet * 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] * 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet * 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet * 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet * 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad) * 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet * 14:53 klausman@cumin1003: START - 
Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet * 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json * 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance * 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json * 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet * 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed * 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet * 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet * 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet * 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and 
(A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad) * 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet * 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad * 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet * 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json * 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet * 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors * 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. 
on all recursors * 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet * 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet * 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet * 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet * 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet * 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet * 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet * 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json * 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors * 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. 
on all recursors * 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki * 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet * 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet * 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet * 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet * 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet * 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet * 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json * 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet * 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:19 javiermonton@deploy1003: helmfile 
[dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet * 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet * 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet * 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet * 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet * 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet * 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet * 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. 
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance * 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet * 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json * 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance * 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json * 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet * 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet * 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet * 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet * 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed * 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet * 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet * 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet * 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie * 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply * 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet * 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json * 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet * 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet * 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet * 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet * 13:51 Lucas_WMDE: UTC afternoon backport+config window done * 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s) * 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json * 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet * 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply * 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment * 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply * 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage * 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply * 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] * 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply * 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet * 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet * 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet * 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. 
(T426010 T385520)]] (duration: 06m 52s) * 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply * 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage * 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet * 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance * 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json * 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet * 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance * 13:37 dbrant@deploy1003: dbrant: Continuing with deployment * 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet * 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet * 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. 
(T426010 T385520)]] * 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet * 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet * 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet * 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s) * 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json * 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json * 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet * 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet * 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet * 13:27 sbisson@deploy1003: sbisson: Continuing with deployment * 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance * 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet * 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet * 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie * 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet * 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet * 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] * 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet * 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet * 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet * 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json * 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet * 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s) * 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet * 13:17 
brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie * 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet * 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling * 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet * 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet * 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment * 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet * 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet * 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . 
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json * 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet * 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' . * 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' . * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' . * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' . * 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' . 
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . 
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] * 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet * 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp * 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet * 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json * 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]] * 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet * 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet * 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet * 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. 
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json * 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet * 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage * 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet * 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet * 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json * 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]] * 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet * 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet * 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet * 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet * 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage * 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet * 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad * 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet * 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet * 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. 
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json * 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance * 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling * 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet * 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s) * 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json * 12:39 kharlan@deploy1003: kharlan: Continuing with deployment * 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet * 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet * 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie * 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] * 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie * 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]]) * 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet * 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet * 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply * 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply * 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet * 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json * 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance * 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json * 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply * 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply * 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker * 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet * 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet * 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet * 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply * 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply * 12:21 moritzm: installing nginx security updates * 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet * 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance * 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage * 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet * 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance * 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance * 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance * 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance * 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance * 12:18 fceratto@cumin1003: dbctl 
commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json * 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet * 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet * 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage * 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet * 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet * 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet * 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet * 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json * 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet * 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet * 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet * 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet * 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet * 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: 
END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet * 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie * 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet * 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org * 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json * 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet * 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org * 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}} * 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json * 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance * 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet * 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet * 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json 
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet * 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet * 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet * 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet * 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' . * 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet * 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org * 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet * 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet * 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet * 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet * 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
testreduce1002.eqiad.wmnet * 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet * 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet * 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet * 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet * 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet * 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker * 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet * 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet * 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet * 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet * 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet * 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet * 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet * 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json * 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet * 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet * 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet * 
11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet * 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet * 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet * 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet * 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet * 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet * 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json * 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet * 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad * 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet * 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet * 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json * 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 
es2037.codfw.wmnet with reason: Maintenance * 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json * 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet * 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet * 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad * 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet * 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet * 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw * 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet * 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage * 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json * 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet * 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet * 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet * 
10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage * 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw * 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet * 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json * 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet * 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet * 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s) * 10:43 dpogorzelski@deploy1003: helmfile 
[ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet * 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:39 jiji@deploy1003: jiji: Continuing with deployment * 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json * 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] * 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet * 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . 
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet * 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices * 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' . * 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json * 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance * 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet * 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json * 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet * 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet * 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet * 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet * 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to 
/var/cache/conftool/dbconfig/20260521-101552-fceratto.json * 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' . * 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet * 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet * 10:12 moritzm: installing postgresql security updates * 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet * 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet * 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org * 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . 
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet * 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet * 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet * 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet * 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json * 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet * 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet * 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet * 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet * 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet * 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org * 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet * 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org * 09:59 jayme@cumin1003: START - Cookbook 
sre.hosts.reboot-single for host registry1005.eqiad.wmnet * 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw * 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet * 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet * 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet * 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet * 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet * 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json * 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet * 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org * 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet * 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet * 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply * 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet * 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet * 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply * 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet * 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet * 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet * 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json * 09:48 jmm@cumin2002: END (PASS) - Cookbook 
sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet * 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json * 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet * 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]] * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet * 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet * 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet * 09:42 
jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet * 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet * 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet * 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet * 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet * 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet * 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet * 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet * 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet * 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet * 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json * 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet * 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet * 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet * 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet * 09:36 
jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet * 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet * 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet * 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet * 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet * 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet * 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet * 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet * 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet * 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet * 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet * 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet * 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet * 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet * 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad * 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw * 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet * 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet * 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json * 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet * 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet * 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet * 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet * 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet * 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw * 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A * 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet * 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet * 
09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet * 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad * 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet * 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet * 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet * 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet * 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw * 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A * 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]] * 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json * 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet * 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet * 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet * 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single 
(exit_code=0) for host wikikube-worker1376.eqiad.wmnet * 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling * 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet * 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet * 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json * 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance * 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet * 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet * 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet * 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed * 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet * 08:37 fceratto@cumin1003: START - Cookbook 
sre.mysql.pool pool es1036: Repooling * 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json * 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json * 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance * 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed * 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet * 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie * 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:51 marostegui@dns1004: END - running authdns-update * 07:50 marostegui@dns1004: START - running authdns-update * 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]] * 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet * 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet * 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of 
kubestagemaster1005.eqiad.wmnet to plain * 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain * 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage * 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd * 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage * 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet * 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd * 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie * 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet * 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet * 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain * 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain * 07:17 marostegui@cumin1003: DONE 
(PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting * 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd * 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd * 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain * 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain * 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd * 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org * 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org * 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet * 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org * 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org * 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet * 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd * 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003 * 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003 * 06:15 marostegui@dns1004: END - running authdns-update * 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]] * 06:13 marostegui@dns1004: START - running authdns-update * 05:39 
marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as
other wikis (2/2) (T426880)]] (duration: 06m 37s) * 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment * 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] * 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s) * 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment * 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] * 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]]) * 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s) * 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment * 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] * 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet * 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet * 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet * 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet * 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet * 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet * 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet * 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet * 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet * 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet * 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet * 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs * 21:52 bking@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host wdqs2019.codfw.wmnet * 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet * 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox) * 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org * 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet * 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet * 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org * 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet * 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet * 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet * 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp * 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet * 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org * 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet * 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye * 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for 
host wdqs1022.eqiad.wmnet * 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet * 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org * 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet * 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet * 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet * 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet * 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s) * 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage * 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment * 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet * 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet * 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage * 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org * 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] * 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet * 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet * 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet * 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet * 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet * 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org * 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye * 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet * 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet * 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet * 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet * 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org * 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet * 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet * 20:19 brouberol@cumin1003: START - Cookbook 
sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 20:16 dwisehaupt@dns1005: END - running authdns-update * 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance * 20:15 dwisehaupt@dns1005: START - running authdns-update * 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet * 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet * 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp * 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet * 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet * 20:06 
bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet * 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet * 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org * 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet * 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org * 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet * 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org * 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet * 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s) * 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet * 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet * 19:27 ejegg@deploy1003: ejegg: Continuing with deployment * 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org * 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org * 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] * 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org * 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet * 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s) * 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet * 18:45 reedy@deploy1003: reedy: Continuing with deployment * 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] * 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org * 18:38 dwisehaupt@dns1004: END - running authdns-update * 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 18:36 dwisehaupt@dns1004: START - running authdns-update * 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org * 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org * 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet * 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet * 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet * 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]] * 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]] * 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet * 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet * 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org * 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 * 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 * 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet * 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
wdqs2021.codfw.wmnet * 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s) * 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet * 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet * 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml * 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org * 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet * 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet * 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet * 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet * 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs * 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw * 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet * 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet * 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org * 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet * 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet * 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet * 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet * 17:22 
btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI due to Java upgrade which causes integration/pipelinelib to not be loadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node
ganeti2025.codfw.wmnet * 13:22 root@cumin1003: START - Cookbook sre.dns.netbox * 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet * 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet * 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment * 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet * 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet * 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0) * 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0) * 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed * 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage * 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183 * 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] * 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0) * 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply * 13:18 
root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie * 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97) * 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A * 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie * 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet * 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s) * 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet * 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet * 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet * 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet * 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply * 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet * 13:11 sbisson@deploy1003: sbisson: Continuing with deployment * 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control 
group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply * 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] * 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet * 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply * 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie * 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet * 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply * 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp * 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet * 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet * 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s) * 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet * 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet * 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet * 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply * 12:59 cwilliams@cumin1003: END (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage * 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet * 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet * 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet * 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet * 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet * 12:57 kharlan@deploy1003: kharlan: Continuing with deployment * 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet * 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet * 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet * 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet * 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie * 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet * 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage * 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet * 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] * 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet * 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet * 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet * 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart * 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet * 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet * 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet * 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet * 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org * 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet * 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet * 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet * 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single 
(exit_code=0) for host ganeti2039.codfw.wmnet * 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet * 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet * 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie * 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet * 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet * 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie * 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org * 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet * 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org * 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet * 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet * 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet * 12:33 
cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet * 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet * 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99) * 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99) * 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org * 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s) * 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet * 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage * 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet * 12:22 kharlan@deploy1003: kharlan: Continuing with deployment * 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet * 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet * 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet * 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert 
"ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] * 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage * 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet * 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet * 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet * 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes * 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet * 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet * 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet * 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet * 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet * 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed * 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet * 12:05 btullis@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet * 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet * 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet * 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad * 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet * 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie * 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet * 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet * 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet * 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp * 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet * 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s) * 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet * 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet * 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook 
sre.mysql.major-upgrade (exit_code=99) * 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99) * 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment * 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet * 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet * 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet * 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] * 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet * 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet * 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet * 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet * 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet * 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) 
for draining ganeti node ganeti2036.codfw.wmnet * 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet * 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet * 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet * 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet * 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet * 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet * 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet * 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet * 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie * 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" * 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" * 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie * 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:38 jiji@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet * 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet * 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed * 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage * 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie * 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet * 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet * 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:20 jmm@cumin2002: END 
(FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage * 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055 * 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055 * 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet * 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage * 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet * 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet * 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage * 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet * 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet * 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet * 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host 
dse-k8s-worker1006.eqiad.wmnet * 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet * 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp * 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet * 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw * 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet * 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet * 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie * 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad * 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage * 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes * 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling 
reboot on A:cephosd-codfw * 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie * 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet * 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage * 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048 * 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes * 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet * 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet * 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet * 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet * 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet * 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet * 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet * 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . 
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet * 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet * 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet * 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet * 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage * 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie * 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet * 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet * 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet * 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage * 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet * 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet * 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet * 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet * 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host 
ml-serve2008.codfw.wmnet * 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet * 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet * 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed * 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet * 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet * 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet * 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet * 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet * 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie * 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet * 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet * 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie * 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet * 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet * 10:23 slyngshede@dns1004: END - running authdns-update * 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet * 10:23 btullis@cumin1003: 
END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet * 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet * 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw * 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet * 10:21 slyngshede@dns1004: START - running authdns-update * 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet * 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet * 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet * 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage * 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet * 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet * 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance * 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet * 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet * 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet * 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet * 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet * 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet * 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet * 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet * 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet * 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet * 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet * 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot * 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet * 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet * 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet * 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet * 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie * 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet * 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: 
Rebooting clouddb1015 [[phab:T415165|T415165]] * 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet * 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet * 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet * 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet * 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet * 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet * 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet * 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet * 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet * 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed * 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 09:43 cwilliams@cumin1003: END (PASS) - 
Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie * 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet * 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet * 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet * 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet * 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet * 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet * 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet * 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet * 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet * 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet * 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet * 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet * 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet * 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet * 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet * 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet * 09:28 btullis@cumin1003: START - Cookbook 
sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes * 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet * 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org * 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet * 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet * 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet * 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet * 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad * 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet * 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage * 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet * 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org * 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage * 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org * 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet * 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]] * 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node 
(exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet * 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet * 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed * 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org * 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org * 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389 * 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389 * 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947 * 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet * 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet * 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet * 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947 * 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie * 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet * 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet * 09:01 cwilliams@cumin1003: 
START - Cookbook sre.mysql.major-upgrade * 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org * 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet * 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org * 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet * 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw * 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org * 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet * 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply * 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply * 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed * 08:26 moritzm: installing Java 11 security updates * 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie * 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org * 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm 
(forward port of latest Java 8 security release) * 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot * 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot * 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org * 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot * 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot * 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet * 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org * 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage * 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage * 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org * 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot * 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet * 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet * 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet * 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot * 07:52 mvernon@cumin2002: 
START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet * 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet * 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet * 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet * 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot * 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet * 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet * 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet * 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet * 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie * 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet * 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet * 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet * 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet * 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet * 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org * 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet * 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org * 07:32 mvernon@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host ms-be2093.codfw.wmnet * 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet * 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet * 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet * 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet * 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org * 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot * 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet * 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet * 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet * 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org * 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet * 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet * 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet * 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet * 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s) * 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet * 07:23 jmm@cumin2002: END (PASS) - 
Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet * 07:19 mlitn@deploy1003: mlitn: Continuing with deployment * 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet * 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet * 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] * 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet * 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s) * 07:11 mlitn@deploy1003: mlitn: Continuing with deployment * 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet * 07:09 moritzm: remove haveged * 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet * 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] * 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet * 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet * 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet * 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet * 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet * 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot * 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet * 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet * 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet * 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002 * 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet 
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet * 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet * 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart * 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet * 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart * 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart * 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json * 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2 * 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2 * 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1 * 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 02:17 dani@deploy1003: helmfile 
[staging] DONE helmfile.d/services/miscweb: apply * 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.* * 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet * 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie * 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003" * 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.* * 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet * 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.* * 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003" * 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s) * 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage * 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie

== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on 
P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037 * 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037 * 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet * 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037 * 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037 * 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:16 jclark@cumin1003: END 
(FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036 * 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036 * 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet * 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder * 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003" * 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by 
cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003" * 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox * 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s) * 20:51 sbassett@deploy1003: sbassett: Continuing with deployment * 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] * 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s) * 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment * 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp * 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet * 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet * 20:34 ebernhardson@deploy1003: ebernhardson: Backport for 
[[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] * 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir * 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet * 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet * 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s) * 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment * 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet * 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:10 tgr@deploy1003: cscott, tgr: Backport for 
[[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] * 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet * 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet * 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet * 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet * 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s) * 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet * 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet * 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet * 19:53 otto@deploy1003: 
otto: Continuing with deployment * 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet * 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet * 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet * 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet * 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s) * 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp * 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] * 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet * 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] * 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s) * 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] * 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org * 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet * 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on 
P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet * 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test analytics/refinery@eeef7f3d] (duration: 02m 00s) * 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test analytics/refinery@eeef7f3d] * 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org * 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org * 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]] * 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:21 jiji@cumin1003: 
START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet * 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir * 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir * 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir * 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet * 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir * 18:52 jiji@cumin1003: END 
(PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw * 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir * 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet * 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host 
wikikube-worker[1123-1126].eqiad.wmnet * 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp * 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]] * 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet * 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet * 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet * 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet * 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet * 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet * 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:51 joal@deploy1003: Started deploy 
[analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d] * 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart * 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s) * 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] * 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet * 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [codfw] DONE 
helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet * 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet * 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet * 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:02 jiji@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet * 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet * 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart * 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet * 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart * 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet * 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:32 jiji@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart * 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet * 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie * 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet * 15:59 jiji@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet * 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet * 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet * 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]] * 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet * 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet * 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet * 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed * 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir * 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet * 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet * 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 
15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet * 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet * 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet * 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet * 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet * 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet * 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy * 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet * 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet * 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet * 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet * 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
ms-be1088.eqiad.wmnet * 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet * 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet * 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet * 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet * 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet * 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet * 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet * 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw * 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet * 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet * 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet * 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet * 15:25 
jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet * 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet * 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet * 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet * 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet * 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet * 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet * 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet * 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83 * 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet * 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet * 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet * 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet * 15:19 jiji@cumin1003: END (PASS) - 
Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet * 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet * 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet * 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet * 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s) * 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet * 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy * 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] * 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet * 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet * 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s) * 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet * 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart * 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy * 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] * 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet * 15:13 jmm@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet * 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet * 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet * 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet * 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet * 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad * 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet * 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet * 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet * 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet * 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet * 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet * 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet * 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s) * 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet * 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet 
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet * 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet * 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed * 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet * 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie * 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet * 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001 * 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet * 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet * 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet * 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad * 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet * 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet * 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] * 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw * 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet * 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet * 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet * 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet * 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet * 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
ms-be1082.eqiad.wmnet * 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s) * 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet * 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet * 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet * 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet * 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment * 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet * 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet * 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet * 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet * 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw * 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet * 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet * 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw * 14:44 
elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet * 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet * 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough * 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage * 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum * 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
C * 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie * 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] * 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s) * 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment * 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy * 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet * 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet * 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet * 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet * 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
aux-k8s-worker2009.codfw.wmnet * 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet * 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet * 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). C * 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet * 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet * 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage * 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json * 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir * 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json * 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet * 14:34 sukhe@cumin1003: START - Cookbook 
sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy * 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet * 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] * 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet * 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet * 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet * 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet * 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet * 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet * 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet * 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet * 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet * 14:27 elukey@cumin1003: END 
(PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet * 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet * 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet * 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet * 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet * 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet * 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet * 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet * 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet * 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie * 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie * 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet * 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet * 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet * 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet * 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet * 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host 
aux-k8s-worker2005.codfw.wmnet * 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet * 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet * 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet * 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet * 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet * 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet * 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet * 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet * 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet * 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet * 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet * 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet * 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox) * 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org * 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet * 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet * 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host 
aux-k8s-worker2003.codfw.wmnet * 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet * 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet * 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet * 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet * 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet * 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet * 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet * 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet * 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage * 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet * 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet * 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet * 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org * 13:57 elukey@cumin1003: START 
- Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet * 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw * 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet * 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet * 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet * 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet * 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet * 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet * 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet * 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet * 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet * 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage * 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet * 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet * 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org * 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet * 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet * 13:39 
jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet * 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet * 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet * 13:37 Lucas_WMDE: UTC afternoon backport+config window done * 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage * 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s) * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003" * 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003" * 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet * 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet * 13:32 jmm@cumin2002: END 
(PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node
pool for host aux-k8s-worker1007.eqiad.wmnet * 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet * 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet * 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet * 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet * 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet * 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet * 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet * 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet * 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet * 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet * 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet * 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw * 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage * 09:02 
jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org * 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage * 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet * 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet * 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org * 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet * 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet * 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s) * 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet * 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org * 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet * 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet * 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad * 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet * 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] * 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008 * 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org * 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet * 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet * 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet * 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s) * 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw * 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie * 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0) * 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet * 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet * 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet * 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host 
ping2004.codfw.wmnet * 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet * 08:40 kharlan@deploy1003: kharlan: Continuing with deployment * 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet * 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet * 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet * 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade * 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet * 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet * 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet * 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet * 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad * 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet * 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet * 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] * 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet * 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet * 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet * 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet * 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet * 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]] * 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet * 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]] * 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet * 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet * 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one 
- cwilliams@cumin1003" * 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet * 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet * 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet * 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet * 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet * 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster * 08:24 Emperor: reboot apus codfw frontends (May reboots) * 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet * 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet * 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet * 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet * 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0) * 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad * 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet * 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet 
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet * 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet * 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover * 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]] * 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster * 07:57 Emperor: reboot apus eqiad frontends (May reboots) * 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet * 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org * 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]] * 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet * 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet * 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet * 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org * 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 07:39 volans@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet * 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet * 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm * 07:33 XioNoX: add gnmic 0.46.0 to reprepro * 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s) * 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover * 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 07:14 mlitn@deploy1003: mlitn: Continuing with deployment * 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover * 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet * 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet * 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] * 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet * 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm * 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet * 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet * 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm * 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance * 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet * 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json * 06:54 moritzm: installing qemu security updates * 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json * 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1 * 06:51 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1 * 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]] * 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1 * 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet * 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json * 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]] * 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet * 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet * 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:33 
marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm * 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover * 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm * 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet * 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json * 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance * 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json * 06:19 fceratto@dns1005: END - running authdns-update * 06:18 fceratto@dns1005: START - running authdns-update * 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json * 06:14 
fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json * 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]] * 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json * 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]] * 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4 * 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s) * 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s) * 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 00:58 ladsgroup@deploy1003: Finished scap sync-world: 
Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s) * 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] * 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org * 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s) * 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment * 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org * 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s) * 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment * 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] * 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet * 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet * 21:16 mutante: gerrit-replica.wikimedia.org back online * 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]] * 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]] * 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends * 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]] * 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet * 20:30 swfrench@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet * 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet * 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet * 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet * 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet * 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet * 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet * 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet * 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet * 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet * 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet * 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet * 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet * 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet * 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet * 19:20 swfrench@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet * 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet * 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet * 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet * 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:48 jhathaway@dns1004: END - running authdns-update * 18:46 jhathaway@dns1004: START - running authdns-update * 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]] * 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet * 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet * 18:31 dzahn@cumin2002: END 
(FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet * 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet * 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet * 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet * 18:26 herron: rebooting alert1002 * 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet * 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet * 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet * 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet * 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet * 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet * 18:16 mutante: releases.wikimedia.org - rebooting backends * 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet * 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet * 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet * 
18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet * 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet * 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet * 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet * 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet * 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]] * 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet * 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet * 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad * 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet * 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet * 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm * 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:46 herron: rebooting alert2002 * 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for 
host alert2002.wikimedia.org * 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org * 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet * 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org * 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org * 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet * 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm * 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet * 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet * 17:37 mutante: stewards* - rebooting * 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet * 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet * 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet * 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet * 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
mc1062.eqiad.wmnet with reason: host reimage * 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet * 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet * 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet * 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet * 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet * 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet * 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet * 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 17:14 mutante: doc.wikimedia.org - rebooting backends * 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet * 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet * 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams * 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet * 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm * 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm * 17:11 mutante: etherpad - rebooting backends * 17:10 
dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet * 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet * 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet * 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet * 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet * 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad * 17:04 mutante: contint2002, phab2002 - rebooting * 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet * 16:37 herron@cumin1003: 
END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw * 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:32 mutante: zuul[12]00[123] / zuul* - rebooting * 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet * 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet * 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet * 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade * 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet * 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet * 16:27 mutante: people.wikimedia.org backend - rebooting * 16:22 mutante: contint1003 - rebooting * 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet * 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet * 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet * 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet * 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet * 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool 
for host wikikube-worker[1364-1366].eqiad.wmnet * 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet * 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet * 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet * 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet * 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet * 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm * 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet * 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet * 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet * 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet * 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet * 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet * 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet * 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node 
depool for host wikikube-worker[1361-1363].eqiad.wmnet * 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet * 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet * 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage * 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet * 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet * 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet * 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet * 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet * 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet * 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet * 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host 
wikikube-worker[1355-1357].eqiad.wmnet * 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw * 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet * 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage * 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet * 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet * 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet * 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet * 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet * 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet * 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet * 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet * 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet * 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet * 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet * 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet * 15:40 
blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet * 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]]) * 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet * 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet * 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet * 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm * 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm * 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet * 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet * 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet * 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet * 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet * 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet * 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet * 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet * 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet * 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
wikikube-worker[2146-2148].codfw.wmnet * 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet * 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet * 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet * 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet * 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet * 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet * 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet * 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet * 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet * 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet * 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet * 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet * 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet * 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet * 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet * 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
wikikube-worker[1346-1348].eqiad.wmnet * 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet * 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet * 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet * 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet * 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet * 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet * 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet * 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet * 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw * 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet * 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet * 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet * 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet * 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet * 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet * 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single 
(exit_code=0) for host thanos-be1005.eqiad.wmnet * 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet * 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet * 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet * 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet * 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet * 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet * 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet * 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet * 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet * 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet * 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet * 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet * 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet * 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet * 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet * 14:52 blake@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet * 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet * 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet * 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet * 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw * 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet * 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet * 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet * 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet * 14:43 Daimona: Running queries to fixup data for [[phab:T426002|T426002]] * 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet * 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet * 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet * 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe * 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host 
wikikube-worker[1331-1333].eqiad.wmnet * 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet * 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet * 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet * 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet * 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet * 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet * 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet * 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet * 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json * 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned * 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned * 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet * 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 14:30 
bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet * 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover * 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover * 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet * 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe * 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet * 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet * 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover * 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover * 14:13 mlitn@deploy1003: Rolling back deployment * 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage * 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]]) * 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet * 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 14:08 mlitn@deploy1003: mlitn: Continuing with deployment * 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host 
reimage * 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet * 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet * 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet * 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] * 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163 * 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s) * 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet * 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm * 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm * 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet * 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet * 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet * 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment * 13:50 
mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe * 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm * 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet * 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] * 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm * 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet * 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet * 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet * 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s) * 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] * 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. 
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s) * 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished * 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. * 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. * 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet * 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] * 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s) * 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] * 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218'] * 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218'] * 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]] * 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage * 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s) * 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet * 13:30 blake@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet * 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet * 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] * 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet * 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet * 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet * 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json * 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet * 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet * 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet * 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet * 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet * 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org * 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to 
/var/cache/conftool/dbconfig/20260518-124030-fceratto.json * 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org * 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet * 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet * 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet * 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet * 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet * 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json * 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet * 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet * 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet * 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm * 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet * 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet * 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet * 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet 
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json * 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm * 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet * 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm * 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet * 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet * 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet * 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm * 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet * 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet * 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
wikikube-worker[2059-2061].codfw.wmnet * 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet * 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-[[phab:T423840|T423840]].patch` * 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet * 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet * 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet * 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet * 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet * 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json 
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]] * 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet * 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json * 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]] * 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet * 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm * 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for 
host mc1072.eqiad.wmnet with OS bookworm * 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye * 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye * 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org * 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet * 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet * 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye * 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org * 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet * 11:21 slyngshede@dns1004: END - running authdns-update * 11:19 slyngshede@dns1004: START - running authdns-update * 11:19 blake@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet * 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org * 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet * 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet * 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet * 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org * 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org * 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org * 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet * 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org * 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org * 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet * 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet * 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org * 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org * 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org * 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet * 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet * 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet * 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org * 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet * 10:56 slyngshede@dns1004: END - running authdns-update * 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye * 10:54 slyngshede@dns1004: START - running authdns-update * 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye * 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye * 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet * 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet * 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe * 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org * 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org * 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org * 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org * 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: START - Cookbook 
sre.hosts.reboot-single for host idp-test2005.wikimedia.org * 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet * 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet * 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet * 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json * 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw * 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw * 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet * 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to 
https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json * 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet * 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet * 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json * 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet * 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json * 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance * 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet * 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet * 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet * 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to 
/var/cache/conftool/dbconfig/20260518-100710-fceratto.json * 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json * 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet * 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]] * 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye * 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet * 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet * 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet * 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json * 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]] * 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet * 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet * 09:47 blake@cumin1003: END (PASS) - 
Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet * 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet * 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye * 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage * 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet * 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet * 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet * 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage * 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet * 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]] * 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086 * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086 * 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086 * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 
18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002" * 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage * 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002" * 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086 * 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye * 09:18 moritzm: installing Java 21 security updates * 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]] * 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts * 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s) * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 09:08 cwilliams@cumin1003: START - Cookbook 
sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082 * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082 * 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082 * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002" * 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002" * 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s) * 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 09:03 ayounsi@dns1004: END - running authdns-update * 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s) * 09:01 ayounsi@dns1004: START - running authdns-update * 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet * 08:50 javiermonton@deploy1003: javiermonton: 
Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix
updates from bookworm point release * 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet * 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json * 06:59 moritzm: installing systemd bugfix updates from trixie point release * 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts * 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts * 06:49 moritzm: installing glibc bugfix updates from trixie point release * 06:44 moritzm: installing openssl bugfix updates from trixie point release * 06:33 moritzm: installing Linux 6.12.88 on trixie hosts * 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 05:11 marostegui@cumin1003: START 
- Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4

== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot
policy FORCED * 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290 * 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290 * 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox * 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts * 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s) * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003" * 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003" * 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet * 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012 * 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet * 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 11:24 
mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye * 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage * 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage * 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye * 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065 * 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065 * 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065 * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 
167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002" * 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002" * 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065 * 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye * 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage * 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:20 
mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage * 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003" * 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003" * 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]] * 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye * 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]] * 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye * 09:32 topranks: Migrate 
cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]] * 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065 * 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]] * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064 * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064 * 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064 * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002" * 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002" * 09:03 
elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064 * 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye * 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json * 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json * 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]] * 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]] * 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json * 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065 * 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064 * 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie * 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie * 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:05 elukey@cumin1003: 
START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064 * 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010 * 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010 * 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with 
chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s) * 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm * 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm * 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage * 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage * 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision 
(exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm * 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290 * 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290 * 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host 
db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289 * 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289 * 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:47 egardner@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s) * 21:43 egardner@deploy1003: egardner: Continuing with deployment * 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] * 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s) * 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment * 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] * 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s) * 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm * 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment * 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] * 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm * 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s) * 20:46 sbisson@deploy1003: sbisson: Continuing with deployment * 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] * 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . 
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage * 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage * 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s) * 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment * 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm * 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm * 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] * 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage * 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm * 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s) * 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage * 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment * 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] * 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . 
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm * 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm * 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage * 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286 * 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286 * 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003" * 19:26 vriley@cumin1003: START 
- Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003" * 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage * 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox * 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm * 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm * 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage * 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage * 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox * 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm * 
17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply * 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply * 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply * 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply * 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply * 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply * 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 17:10 cmooney@dns2005: END - running authdns-update * 17:09 cmooney@dns2005: START - running authdns-update * 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply 
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie * 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]] * 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]] * 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]] * 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply * 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply * 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply * 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply * 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm * 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply * 16:07 cmooney@cumin1003: 
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003" * 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003" * 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply * 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290 * 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290 * 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003" * 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003" * 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 
db1288.eqiad.wmnet with reason: host reimage * 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003 * 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox * 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage * 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003 * 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm * 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie * 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289 * 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289 * 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003" * 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: 
update mgmt [db1289] - vriley@cumin1003" * 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox * 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm * 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm * 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage * 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage * 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s) * 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage * 15:12 bearloga@deploy1003: bearloga: Continuing with deployment * 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to 
the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] * 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage * 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm * 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json * 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288 * 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm * 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by 
cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288 * 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003" * 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003" * 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm * 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json * 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage * 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 
14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003" * 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003" * 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289 * 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289 * 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json * 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm * 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s) * 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage * 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285 * 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285 * 14:31 vriley@cumin1003: END 
(PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003" * 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003" * 14:29 phuedx@deploy1003: phuedx: Continuing with deployment * 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json * 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm * 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] * 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284 * 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json * 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance * 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage * 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm * 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json * 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284 * 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) 
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003" * 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003" * 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json * 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage * 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s) * 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment * 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm * 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json * 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] * 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage * 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json * 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s) * 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm * 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage * 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. 
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment * 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json * 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. * 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. * 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] * 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. 
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned * 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART * 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json * 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance * 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned * 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned * 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s) * 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART * 13:45 krinkle@deploy1003: krinkle: Continuing with deployment * 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART * 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance 
on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm * 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] * 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s) * 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART * 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment * 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned * 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned * 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned * 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned * 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm * 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] * 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283 * 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283 * 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s) * 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003" * 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003" * 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision 
(exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:12 sbisson@deploy1003: sbisson: Continuing with deployment * 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox * 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover * 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282 * 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] * 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover * 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json * 13:07 vriley@cumin1003: START - Cookbook 
sre.network.configure-switch-interfaces for host db1282 * 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003" * 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003" * 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json * 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox * 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281 * 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]] * 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]]) * 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281 * 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003" * 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: 
"Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003" * 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply * 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply * 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply * 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply * 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox * 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280 * 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280 * 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003" * 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003" * 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json * 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279 * 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary 
switchover s8 [[phab:T426291|T426291]] * 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox * 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279 * 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003" * 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003" * 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply * 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply * 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox * 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003 * 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003 * 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458 * 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458 * 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json * 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 12:20 marostegui@cumin1003: dbctl commit (dc=all): 
'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json * 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json * 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync * 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync * 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply * 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply * 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply * 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply * 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye * 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye * 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to 
/var/cache/conftool/dbconfig/20260514-104521-marostegui.json * 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'. * 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'. * 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply * 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply * 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye * 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye * 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with 
OS bookworm * 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned * 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned * 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned * 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply * 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply * 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned * 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye * 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye * 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye * 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye * 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage * 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]] * 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage * 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage * 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
mc1068.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye * 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye * 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json * 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]]) * 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply * 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply * 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3 * 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply * 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]] * 06:39 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]] * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie * 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie * 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage * 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage * 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie * 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie * 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie * 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie * 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie * 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: 
Restart for upgrade to JVM 11.0.31 - eevans@cumin1003

== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] * 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s) * 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment * 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] * 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply * 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply * 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply * 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply * 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply * 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply * 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:25 tchin@deploy1003: helmfile [codfw] START 
helmfile.d/services/eventgate-analytics-external: apply * 18:20 cmooney@dns2005: END - running authdns-update * 18:19 cmooney@dns2005: START - running authdns-update * 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply * 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply * 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003" * 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003" * 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply * 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply * 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply * 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply * 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply * 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply * 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply * 17:42 tchin@deploy1003: helmfile [staging] START 
helmfile.d/services/eventgate-analytics-external: apply * 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]] * 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure * 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul * 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]] * 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet * 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet * 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet * 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet * 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]] * 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]] * 16:10 btullis@cumin1003: END (FAIL) - Cookbook 
sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.* * 15:36 fabfur: repooling cp7009 to test 
haproxy-awslc behavior ([[phab:T419825|T419825]]) * 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.* * 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]]) * 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:16 cmooney@dns2005: END - running authdns-update * 15:15 cmooney@dns2005: START - running authdns-update * 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:00 jelto@deploy1003: 
helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003 * 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie * 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s) * 14:37 kharlan@deploy1003: kharlan: Continuing with deployment * 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] * 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003 * 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002" * 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003 * 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002" * 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage * 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s) * 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage * 14:16 jforrester@deploy1003: helmfile [eqiad] START 
helmfile.d/services/wikifunctions: apply * 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003 * 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:15 jforrester@deploy1003: jforrester: Continuing with deployment * 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] * 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 14:08 Lucas_WMDE: UTC afternoon backport+config window done * 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}} * 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment * 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.* * 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org * 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply * 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply * 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply * {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}} * 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply * 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply * 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie * 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply * 13:59 
eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}} * 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]]) * 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org * 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s) * 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002 * 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment * 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] * {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}} * 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment * {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}} * 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002 * 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], 
[[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}} * 13:25 moritzm: installing openjdk-11 security updates * 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s) * 13:07 sbisson@deploy1003: sbisson: Continuing with deployment * 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw * 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] * 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s) * 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment * 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] * 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.* * 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]]) * 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]] * 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed * 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet * 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet * 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]] * 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003" * 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003" * 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie * 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed * 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. 
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie * 11:40 topranks: delete old direct ibgp peering between cr1-drms and cr2-drmrs [[phab:T424611|T424611]] * 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 11:27 topranks: add ibgp peering between cr1-drms and cr2-drmrs over loopback IPs [[phab:T424611|T424611]] * 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage * 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage * 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts * 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie * 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie * 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie * 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet * 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet * 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade * 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie * 10:55 jiji@deploy1003: helmfile 
[codfw] DONE helmfile.d/services/ratelimit: apply * 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org * 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts * 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage * 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org * 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage * 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage * 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]] * 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage * 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie * 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for 
hosts pki2002.codfw.wmnet * 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply * 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply * 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:10 moritzm: installing Apache security updates on Bullseye * 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie * 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply * 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye * 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply * 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie * 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie * 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie * 10:02 jmm@deploy1003: helmfile [staging] DONE 
helmfile.d/services/proton: apply * 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply * 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json * 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json * 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye * 09:56 moritzm: installing distro-info-data updates from Bookworm point release * 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]] * 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]] * 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye * 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json * 09:51 moritzm: installing ca-certificates update from Bookworm point release * 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye * 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 
mc1064.eqiad.wmnet with reason: host reimage * 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s) * 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 09:41 kharlan@deploy1003: kharlan: Continuing with deployment * 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] * 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage * 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage * 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage * 09:28 cmooney@dns2005: END - running authdns-update * 09:27 cmooney@dns2005: START - running authdns-update * 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]] * 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet * 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for 
host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage * 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]] * 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye * 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye * 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw * 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003" * 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003" * 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:45 moritzm: installing dnsmasq security updates * 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply * 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - 
cmooney@cumin1003" * 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply * 08:38 cmooney@dns2005: END - running authdns-update * 08:37 cmooney@dns2005: START - running authdns-update * 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply * 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003" * 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply * 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply * 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply * 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s) * 08:20 kharlan@deploy1003: kharlan: Continuing with deployment * 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] * 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie * 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie * 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay * 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie * 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie * 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie * 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie * 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage * 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage * 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage * 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage * 05:37 marostegui@cumin1003: START - Cookbook 
sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie * 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie * 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie * 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie * 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie * 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie * 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie * 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie * 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm * 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm * 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:42 
vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm * 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm * 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278 * 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278 * 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update 
mgmt [db1278] - vriley@cumin1003" * 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003" * 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox * 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm * 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277 * 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277 * 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:25 vriley@cumin1003: END (PASS) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003" * 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003" * 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm * 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276 * 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276 * 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003" * 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003" * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s) * 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm * 01:37 
vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s) * 01:28 zabe@deploy1003: zabe: Continuing with deployment * 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm * 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] * 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275 * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275 * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003" * 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003" * 01:08 
vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"

== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language
fallback" (T425988)]] (duration: 12m 45s) * 23:40 cscott@deploy1003: cscott: Continuing with deployment * 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] * 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s) * 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm * 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment * 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update * 21:57 dwisehaupt@dns1004: START - running authdns-update * 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm * 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] * 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273 * 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm * 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273 * 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s) * 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003" * 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003" * 
21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Change
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* {{safesubst:SAL entry|1=12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42}}
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* {{safesubst:SAL entry|1=12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425}}
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003:
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s) * 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie * 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie * 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie * 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie * 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie * 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie * 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie * 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet * 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]] * 06:27 jayme@dns1004: END - running authdns-update * 06:26 jayme@dns1004: START - running authdns-update * 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs 
[[phab:T423911|T423911]] (duration: 36m 36s) * 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply * 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply * 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s) * 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment * 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] * 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]] * 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s) * 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment * 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] * 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003" * 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003" * 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox * 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm * 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s) * 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment * 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] * 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s) * 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm * 19:58 zabe@deploy1003: zabe: Continuing with deployment * 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] * 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye * 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts * 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper * 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:24 vriley@cumin1003: END (PASS) - Cookbook 
sre.network.configure-switch-interfaces (exit_code=0) for host db1269 * 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269 * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003" * 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003" * 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox * 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm * 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:16 dzahn@dns1005: END - running authdns-update * 19:14 dzahn@dns1005: START - running authdns-update * 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva`. 
to free up disk space * 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye * 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync * 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync * 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie * 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]] * 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm * 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm * 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm * 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage 
(exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye * 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json * 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268 * 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268 * 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003" * 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003" * 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json * 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox * 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json * 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 17:12 dancy@deploy1003: 
Installation of scap version "4.263.0" completed for 2 hosts * 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s) * 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json * 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie * 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json * 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance * 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 16:39 
jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'. * 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'. * 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'. * 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'. * 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. * 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'. * 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. * 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s) * 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 16:23 zabe@deploy1003: zabe: Continuing with deployment * 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] * 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s) * 15:42 zabe@deploy1003: zabe: Continuing with deployment * 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] * 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm * 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement * 14:54 cdanis@deploy1003: 
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017 * 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017 * 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:39 Lucas_WMDE: UTC afternoon backport+config window done * 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18 * 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment * {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], 
[[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}} * 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] * {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}} * 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm * 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment * {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username 
previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}} * 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad * 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs * 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm * 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs * 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad * 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw * 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs * 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs * 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw * 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username 
previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}} * 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]] * 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s) * 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm * 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie * 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment * 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] * 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'. * 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'. * 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'. * 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'. * 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. * 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'. * 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. * 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s) * 13:06 elukey: remove old discovery pki intermediate * 13:03 otto@deploy1003: otto: Continuing with deployment * 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] * 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. 
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s) * 12:47 kharlan@deploy1003: kharlan: Continuing with deployment * 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] * 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie * 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]]) * 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot * 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts * 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply * 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply * 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s) * 11:21 jayme@deploy1003: Rolling back deployment * 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] * 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance * 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance * 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]] * 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts * 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance * 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s) * 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image * 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:16 slyngs: Migrate of lvs2012 due to hardware issues * 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s) * 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]] * 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:04 fceratto@cumin1003: DONE (PASS) - 
Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 09:59 kharlan@deploy1003: kharlan: Continuing with deployment * 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure * 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure * 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]] * 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:37 
jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json * 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json * 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json * 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:56 fceratto@cumin1003: START - Cookbook 
sre.mysql.pool pool db1230: [[phab:T419635|T419635]] * 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance * 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance * 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json * 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd * 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json * 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01 * 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance * 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01 * 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet * 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet * 08:10 slyngshede@dns1004: END - running authdns-update * 08:08 slyngshede@dns1004: START - running authdns-update * 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:00 ayounsi@cumin1003: END 
(PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003" * 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003" * 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. * 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. * 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors * 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors * 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm * 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage * 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage * 
06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet * 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet * 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org * 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org * 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image == 2026-05-10 == * 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]] * 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]] * 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]] * 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]] * 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per 
[[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]] * 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]] * 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]] * 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image == 2026-05-09 == * 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003" * 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003 * 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003 * 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003" * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet 
with OS bookworm * 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm * 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED == 2026-05-08 == * 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267 * 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267 * 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003" * 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003" * 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox * 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm * 
23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm * 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266 * 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266 * 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003" * 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003" * 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm * 21:42 vriley@cumin1003: END (PASS) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage * 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage * 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm * 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265 * 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265 * 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" * 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" * 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox * 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. 
Unless we find an obvious smoking gun, expect noise to continue for the time being :/ * 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health * 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps * 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad * 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart * 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet * 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet * 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]] * 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad * 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart * 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]] * 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad * 09:36 jelto@deploy1003: helmfile 
[aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json * 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json * 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json * 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json * 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json * 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance * 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json * 08:40 fceratto@cumin1003: dbctl commit 
(dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json * 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json * 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json * 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json * 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance * 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by 
cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox * 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie * 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org * 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie * 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie * 06:11 moritzm: installing postorius security updates * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage * 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage * 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie * 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie * 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie * 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie * 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie * 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) 
for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage * 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage * 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie * 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024 * 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024 * 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003" * 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003" * 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox * 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie * 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with 
reason: host reimage * 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage * 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023 * 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023 * 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003" * 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003" * 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox == 2026-05-07 == * 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie * 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage * 
23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage * 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie * 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s) * 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] * 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s) * 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] * 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s) * 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] * 21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] * 21:23 cscott@deploy1003: cscott: Continuing with deployment * 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 
T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t * 21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s) * 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment * 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be v * 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] * 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki * 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki * 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage * 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage * 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s) * 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment * 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie * 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] * 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie * 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie * 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022 * 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022 * 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003" * 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003" * 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox * 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply * 18:26 ebysans@deploy1003: helmfile 
[codfw] START helmfile.d/services/editor-analytics: apply * 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply * 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply * 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply * 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply * 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply * 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply * 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply * 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply * 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply * 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply * 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:06 cdanis@dns1005: END - running authdns-update * 18:04 cdanis@dns1005: START - running authdns-update * 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s) * 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis * 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply * 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply * 17:51 krinkle@deploy1003: krinkle: Continuing with deployment * 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply * 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply * 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] * 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply * 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply * 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart * 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart * 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 16:32 jynus: restarting backup1-* database primary hosts * 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart * 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart * 16:14 sukhe@dns1004: END - running authdns-update * 16:13 sukhe@dns1004: START - running authdns-update * 16:13 sukhe@dns1004: START - running authdns-update * 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE 
helmfile.d/services/miscweb: apply * 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox) * 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart * 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox) * 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply * 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply * 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply * 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply * 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling 
P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts * 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts * 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet * 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply * 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply * 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs * 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad * 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply * 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply * 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply * 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply * 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply * 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s) * 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply * 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply * 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] * 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s) * 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad * 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply * 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply * 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] * 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply * 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply * 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s) * 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 14:32 slyngshede@dns1004: END - running authdns-update * 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] * 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START 
helmfile.d/services/miscweb: apply * 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply * 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply * 14:30 slyngshede@dns1004: START - running authdns-update * 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply * 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply * 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train * 14:30 jmm@dns1004: END - running authdns-update * 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply * 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply * 14:28 jmm@dns1004: START - running authdns-update * 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003" * 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003" * 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply * 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply * 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply * 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply * 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox * 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw * 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: 
apply * 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply * 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply * 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply * 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw * 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s) * 13:30 stran@deploy1003: stran: Continuing with deployment * 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] * 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s) * 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment * 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] * 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox * 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 12:45 sukhe@dns1004: FAIL - running authdns-update * 12:44 sukhe@dns1004: START - running authdns-update * 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie * 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org * 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host 
install5004.wikimedia.org with OS bookworm * 12:23 slyngshede@dns1004: FAIL - running authdns-update * 12:21 slyngshede@dns1004: START - running authdns-update * 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003" * 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003" * 12:12 slyngshede@dns1004: FAIL - running authdns-update * 12:11 slyngshede@dns1004: START - running authdns-update * 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage * 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie * 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage * 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage * 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage * 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool 
db1227: after reimage to trixie * 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie * 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage * 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie * 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org * 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org * 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie * 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage * 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie * 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage * 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json * 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie * 
11:11 moritzm: installing modsecurity-apache security updates * 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie * 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm * 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json * 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002" * 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002" * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:58 
root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184 * 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184 * 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184 * 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s) * 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage * 10:54 root@cumin1003: START - Cookbook sre.dns.netbox * 10:54 fceratto@cumin1003: 
dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json * 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage * 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184 * 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie * 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage * 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] * 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json * 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage * 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage * 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:40 
daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json * 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:31 marostegui@cumin1003: 
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie * 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie * 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie * 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie * 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie * 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie * 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie * 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie * 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie * 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org * 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 10:13 moritzm: rebalance ganeti cluster in ulsfo following host 
reimages [[phab:T424686|T424686]] * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie * 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox * 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org * 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage * 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage * 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie * 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd * 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host 
db2182.codfw.wmnet with OS trixie * 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie * 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie * 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd * 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd * 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet * 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet * 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie * 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage * 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org * 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage * 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage * 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org * 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage * 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage * 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS 
trixie * 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie * 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie * 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie * 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie * 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie * 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie * 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet * 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd * 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd * 08:28 
marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json * 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage * 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage * 08:23 XioNoX: drmrs remove old v6 gateway IP * 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003" * 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage * 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003" * 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd * 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync * 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync * 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet * 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet * 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host 
ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s) * 07:49 dcausse@deploy1003: dcausse: Continuing with deployment * 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd * 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] * 07:32 moritzm: installing apache2 security updates * 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd * 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet * 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host 
ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01 * 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01 * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie * 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie * 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie * 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage * 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage * 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage * 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage * 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie * 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie * 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie * 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie * 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json * 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary 
[[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json * 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]] * 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]] * 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json * 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s) * 01:09 zabe@deploy1003: zabe: Continuing with deployment * 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] * 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s) * 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] * 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie * 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s) * 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:10 cjming@deploy1003: cjming: Continuing with deployment * 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] * 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox * 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021 * 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021 * 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s) * 21:48 zabe@deploy1003: zabe: Continuing with deployment * 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] * 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie * 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host 
pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003" * 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003" * 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021 * 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021 * 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s) * 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment * 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] * 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie * 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie * 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm * 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: 
apply * 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:37 dzahn@dns1005: END - running authdns-update * 18:35 dzahn@dns1005: START - running authdns-update * 18:33 brennen: 
1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1 * 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie * 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm * 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo * 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo * 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002 * 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo * 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo * 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply * 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply * 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply * 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply * 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply * 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply * 17:32 swfrench@deploy1003: helmfile [eqiad] DONE 
helmfile.d/services/shellbox: apply * 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply * 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]] * 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo * 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply * 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply * 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply * 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply * 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply * 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE 
helmfile.d/services/shellbox-syntaxhighlight: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply * 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply * 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work * 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo * 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]] * 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie * 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie * 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002 * 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie * 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] * 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage * 13:45 jgreen@dns1004: END - running authdns-update * 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s) * 13:44 jgreen@dns1004: START - running authdns-update * 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage * 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm * 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors * 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors * 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment * 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 13:31 
alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors * 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host 
cp4046.ulsfo.wmnet with OS trixie * 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors * 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003" * 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003" * 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] * 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie * 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie * 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie * 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie * 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START 
helmfile.d/services/miscweb: apply * 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie * 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage * 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update * 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage * 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage * 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage * 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie * 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie * 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie * 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie * 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s) * 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with 
deployment * 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie * 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] * 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage * 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet * 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage * 11:50 moritzm: installing openjdk-17 security updates * 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json * 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet * 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie * 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot * 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie * 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
rdb2011.codfw.wmnet with reason: host reimage * 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie * 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage * 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm * 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002" * 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json * 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie * 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json * 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002" * 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie * 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie * 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json * 11:14 
slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie * 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie * 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot * 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage * 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage * 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage * 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage * 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage * 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage * 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm * 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous 
config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json * 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json * 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie * 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie * 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json * 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json * 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json * 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage * 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage * 
09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003" * 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json * 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance * 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003" * 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json * 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS 
trixie * 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) * 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update * 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json * 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]] * 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]] * 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json * 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json * 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s) * 08:59 zabe@deploy1003: zabe: Continuing with deployment * 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] * 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json * 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json * 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json * 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance * 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie * 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie * 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie * 08:06 awight: EU morning deployment is done * 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw * 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]] * 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool 
db2144: Replacing HW [[phab:T418979|T418979]] * 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie * 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s) * 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment * 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can * 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] * 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s) * 07:22 awight@deploy1003: awight, lilients: Continuing with deployment * 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] * 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet * 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie * 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie * 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet * 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet * 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:20 jmm@cumin2002: START - Cookbook 
sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet * 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]] * 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage * 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage * 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage * 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage * 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie * 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie * 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie * 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie * 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie * 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie * 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie * 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie * 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie * 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie * 05:19 
marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie * 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie * 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie * 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie * 05:11 marostegui@dns1004: END - running authdns-update * 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json * 05:09 marostegui@dns1004: START - running authdns-update * 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json * 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json * 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]] * 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]] * 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json * 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie * 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s) * 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] * 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s) * 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment * 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. 
(T417513)]] * 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts * 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s) * 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s) * 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment * 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be ve * 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] * 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie * 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s) * 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie * 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment * 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage * 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] * 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage * 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage * 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage * 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. * 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'. * 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'. 
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002 * 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002 * 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002 * 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003" * 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: 
Update records for host kafka-logging1002 - herron@cumin1003" * 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox * 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002 * 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie * 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts * 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s) * 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation * 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]" * 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]" * 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie * 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s) * 18:56 ladsgroup@deploy1003: ladsgroup: 
Continuing with deployment * 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] * 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s) * 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie * 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with 
OS trixie * 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw * 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw * 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad * 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad * 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0 * 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie * 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage * 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage * 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. * 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'. * 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'. 
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'. * 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json * 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json * 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance * 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json * 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance * 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json * 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . 
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json * 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie * 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie * 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json * 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie * 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . 
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json * 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json * 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json * 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance * 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie * 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie * 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:29 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie * 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie * 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie * 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie * 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie * 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie * 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json * 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json * 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json * 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 
db2206.codfw.wmnet with reason: Maintenance * 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json * 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance * 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json * 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json * 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json * 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie * 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json * 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie * 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json * 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 08:50 moritzm: 
installing augeas security updates * 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors * 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json * 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json * 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . * 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . 
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . * 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . * 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement * 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json * 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance * 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json * 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage * 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors * 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 08:28 
jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json * 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie * 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage * 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]] * 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie * 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json * 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]] * 08:05 ayounsi@dns1004: END - running authdns-update * 08:03 ayounsi@dns1004: START - running authdns-update * 
08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json * 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie * 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003" * 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003" * 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie * 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie * 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie * 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json * 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json * 07:55 awight: EU morning deployment was fun * 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to 
/var/cache/conftool/dbconfig/20260505-075416-fceratto.json * 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance * 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]] * 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json * 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]] * 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]] * 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]] * 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie * 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie * 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie * 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie * 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s) * 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage * 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment * 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers 
(see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] * 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage * 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage * 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie * 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage * 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie * 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie * 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie * 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 
db2222.codfw.wmnet with reason: Reimage to Trixie * 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie * 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie * 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage * 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage * 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003" * 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003 * 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003 * 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003" * 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie * 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie * 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie * 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie * 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie * 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 
(duration: 03m 12s) * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002" * 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002" * 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s) * 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] * 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s) * 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment * 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] * 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s) * 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging * 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s) * 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment * 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for 
namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] * 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s) * 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment * 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] * 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s) * 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie * 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment * 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] * 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors * 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors * 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003" * 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003" * 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage * 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage * 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting * 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005 * 19:27 herron@cumin1003: START - Cookbook 
sre.hosts.move-vlan for host kafka-logging1005 * 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie * 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'. * 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'. * 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s) * 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] * 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s) * 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] * 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s) * 18:11 dancy@deploy1003: dancy: Rolling back deployment * 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:09 dancy@deploy1003: Started scap sync-world: testing * 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts * 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s) * 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 
jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s) * 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] * 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie * 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json * 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json * 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts * 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts * 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage * 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage * 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json * 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance * 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh * 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh * 14:25 pt1979@cumin1003: DONE (PASS) - 
Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh * 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001 * 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001 * 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001 * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003" * 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003" * 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json * 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox * 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001 * 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie * 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after 
maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json * 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s) * 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 13:55 sbisson@deploy1003: sbisson: Continuing with deployment * 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see 
https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable * 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] * 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox * 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json * 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s) * 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment * 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] * 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json * 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json * 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json * 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json * 13:13 moritzm: installing jaraco.context security updates * 13:10 
jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet * 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm * 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json * 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json * 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic * 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json * 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance * 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json * 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage * 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 
2:00:00 on durum5004.eqsin.wmnet with reason: host reimage * 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json * 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json * 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json * 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json * 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance * 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json * 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json * 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:55 
jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors * 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json * 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox * 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet * 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet * 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm * 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json * 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to 
/var/cache/conftool/dbconfig/20260504-113620-fceratto.json * 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance * 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json * 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie * 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage * 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage * 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json * 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json * 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json * 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance * 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json * 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance * 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json * 10:48 moritzm: installing bash updates from trixie point release * 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json * 10:42 moritzm: installing postgresql-17 security updates * 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie * 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie * 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm * 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors * 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors * 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:34 jmm@cumin2002: 
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json * 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json * 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json * 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance * 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage * 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to 
/var/cache/conftool/dbconfig/20260504-100818-fceratto.json * 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie * 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie * 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie * 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie * 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json * 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json * 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org * 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json * 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance * 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie * 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json * 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json * 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events * 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events * 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events * 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json * 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json * 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie * 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: 
tools-k8s-worker1001.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet * 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json * 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie * 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts * 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json * 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet * 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json * 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance * 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance * 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet * 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage * 08:11 marostegui@cumin1003: START - Cookbook 
sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage * 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply * 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply * 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync * 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s) * 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync * 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync * 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync * 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet * 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts * 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment * 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie * 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] * 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet * 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie * 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie * 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie * 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie * 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic * 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all 
IPs except the asset tag one - marostegui@cumin1003" * 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie * 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie * 07:38 moritzm: installing Linux 6.12.85 on trixie hosts * 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet * 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet * 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet * 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org * 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org * 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie * 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie * 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie * 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie * 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet 
with OS trixie * 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage * 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage * 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage * 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage * 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie * 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage * 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage * 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie * 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie * 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie * 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie * 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]] * 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie * 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie * 05:58 marostegui@cumin1003: START - Cookbook 
sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] * 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie * 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie * 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie * 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage * 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:09 jhancock@cumin2002: 
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie * 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie * 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie * 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage * 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - 
jhancock@cumin2002" * 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage * 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage * 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage * 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage * 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage * 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage * 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage * 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage * 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie * 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie * 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie * 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage * 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 
jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie * 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie * 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie * 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie * 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie * 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host 
wikikube-worker2367.codfw.wmnet with OS trixie * 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie * 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s) * 11:57 samtar@deploy1003: samtar: Continuing with deployment * 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] * 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:52 jhancock@cumin2002: END (FAIL) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s) * 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 01:37 jhancock@cumin2002: START 
- Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie * 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie * 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie * 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie * 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie * 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie * 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 
wikikube-worker2364.codfw.wmnet with reason: host reimage * 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage * 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage * 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage * 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage * 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage * 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie * 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie * 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie * 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie * 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie * 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:02 
jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie * 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" == 2026-05-01 == * 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage * 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage * 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage * 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage * 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage * 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage * 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie * 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie * 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie * 23:25 jhancock@cumin2002: END (PASS) - Cookbook 
sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie * 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie * 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie * 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage * 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage * 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: 
START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie * 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy 
FORCE_RESTART * 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:46 jhancock@cumin2002: END (PASS) - Cookbook 
sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy 
FORCE_RESTART * 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host 
wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374 * 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374 * 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373 * 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373 * 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372 * 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372 * 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371 * 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367 * 20:56 jhancock@cumin2002: START - Cookbook 
sre.network.configure-switch-interfaces for host wikikube-worker2367 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359 * 20:54 jhancock@cumin2002: START - Cookbook 
sre.network.configure-switch-interfaces for host wikikube-worker2359 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358 * 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357 * 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357 * 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002" * 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002" * 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox * 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie * 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s) * 20:02 krinkle@deploy1003: krinkle: Continuing with deployment * 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage * 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see 
https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] * 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage * 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s) * 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]] * 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts * 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s) * 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. * 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts * 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002 * 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002 * 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002 * 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:40 herron@cumin1003: END (PASS) - Cookbook 
sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003" * 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003" * 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox * 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002 * 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie * 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie * 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage * 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage * 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003 * 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003 * 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003 * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera 
(exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003" * 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003" * 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox * 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003 * 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie * 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet * 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet * 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet * 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet * 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . 
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet * 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts * 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s) * 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage * 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage * 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004 * 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004 * 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004 * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003" * 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003" * 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts * 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s) * 14:57 
herron@cumin1003: START - Cookbook sre.dns.netbox * 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004 * 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie * 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply * 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply * 13:24 _Gerges: WikiMonitor setup * 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080 * 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078 * 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079 * 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078 * 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077 * 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host 
cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003" * 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003" * 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox * 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s) * 09:53 samtar@deploy1003: samtar: Continuing with deployment * 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] * 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s) * 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] * 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s) * 00:13 zabe@deploy1003: zabe: Continuing with deployment * 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]

== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>

== 2026-05-24 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] -
bking@cumin2002 * 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002 * 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]] * 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]] * 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 17:34 topranks: enable ttl protection on esams CRs IBGP session * 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session * 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet * 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox * 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by 
cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002" * 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet * 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox * 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet * 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet * 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply * 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply * 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet * 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts * 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]] * 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts * 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp * 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet * 
13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet * 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia * 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet * 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp * 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet * 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . * 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . * 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet * 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply * 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply * 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet * 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 11:28 cwilliams@cumin1003: END (PASS) - Cookbook 
sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed * 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet * 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 * 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed * 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet * 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet * 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet * 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C * 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet * 10:43 cgoubert@deploy1003: helmfile [codfw] DONE 
helmfile.d/services/rest-gateway: apply * 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed * 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet * 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]] * 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie * 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie * 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet * 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed * 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet * 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage * 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie * 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet * 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage * 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on
P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet * 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage * 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie * 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage * 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet * 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet * 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie * 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet * 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet * 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie * 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie * 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet * 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp * 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp * 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet * 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet * 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp * 08:15 
cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors * 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003" * 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors * 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003" * 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet * 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet * 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet * 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp * 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A * 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp * 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished 
rebooting cp3075.esams.wmnet * 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A * 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057 * 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057 * 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet * 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet * 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp * 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet * 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet * 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org * 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org * 05:25 marostegui@dns1004: END - running authdns-update * 05:24 marostegui@dns1004: START - running authdns-update * 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]] * 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot * 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime 
(exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet == 2026-05-21 == * 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s) * 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified * 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] * 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie * 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage * 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage * 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie * 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 19:22 eevans@cumin1003: END 
(PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase * 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:53 papaul: rebooting msw1-codfw * 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply * 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply * 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply * 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply * 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply * 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply * 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply * 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply * 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 17:46 
cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply * 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply * 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply * 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply * 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028 * 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down * 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 
17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply * 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply * 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet * 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet * 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply * 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply * 17:08 swfrench@deploy1003: 
helmfile [staging] START helmfile.d/services/shellbox-timeline: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:03 jgiannelos@deploy1003: helmfile [eqiad] START 
helmfile.d/services/mw-parsoid: apply * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029 * 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031 * 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029 * 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028 * 16:55 papaul: rebooting msw-d3-codfw * 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 16:52 papaul: rebooting msw-c7-codfw * 16:51 papaul: rebooting msw-c6-codfw * 16:48 papaul: rebooting msw-b7-codfw * 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet * 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet * 16:43 papaul: rebooting msw-b6-codfw * 16:40 papaul: rebooting msw-a1-codfw * 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet * 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031 * 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031 * 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030 * 16:35 jhancock@cumin2002: START - Cookbook 
sre.network.configure-switch-interfaces for host wdqs2031 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030 * 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029 * 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028 * 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002" * 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002" * 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox * 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables * 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables * 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling * 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json * 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 
16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling * 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json * 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json * 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance * 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json * 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json * 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node 
(exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet * 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet * 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]] * 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet * 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json * 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet * 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet * 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet * 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master * 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet * 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors * 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. 
on all recursors * 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s) * 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json * 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet * 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet * 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors * 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors * 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master * 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet * 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki * 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]] * 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet * 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet * 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] * 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet * 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet * 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet * 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad) * 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet * 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet * 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json * 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance * 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config 
saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json * 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet * 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed * 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet * 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet * 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet * 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad) * 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet * 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad * 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet * 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet * 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: 
apply * 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json * 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet * 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors * 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors * 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet * 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet * 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet * 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet * 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet * 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet * 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet * 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json * 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors * 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. 
on all recursors * 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki * 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet * 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet * 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet * 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet * 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet * 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet * 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json * 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet * 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:19 javiermonton@deploy1003: helmfile 
[dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet * 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet * 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet * 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet * 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet * 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet * 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet * 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. 
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet * 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet * 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet * 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed * 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet * 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet * 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet * 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie * 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply * 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet * 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json * 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet * 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet * 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet * 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet * 13:51 Lucas_WMDE: UTC afternoon backport+config window done * 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s) * 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json * 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet * 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply * 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment * 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply * 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes * 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage * 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply * 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] * 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply * 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet * 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet * 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet * 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. 
(T426010 T385520)]] (duration: 06m 52s) * 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply * 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage * 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet * 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance * 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json * 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet * 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance * 13:37 dbrant@deploy1003: dbrant: Continuing with deployment * 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet * 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet * 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. 
(T426010 T385520)]] * 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet * 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet * 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet * 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s) * 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json * 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json * 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet * 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet * 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet * 13:27 sbisson@deploy1003: sbisson: Continuing with deployment * 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance * 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet * 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet * 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie * 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet * 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet * 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] * 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet * 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet * 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet * 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json * 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet * 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s) * 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet * 13:17 
brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie * 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet * 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling * 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet * 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet * 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment * 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet * 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet * 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet * 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . 
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . * 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . 
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] * 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet * 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp * 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet * 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json * 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]] * 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet * 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet * 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet * 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. 
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json * 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet * 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage * 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet * 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet * 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json * 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]] * 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet * 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet * 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet * 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet * 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage * 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet * 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad * 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet * 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet * 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. 
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json * 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance * 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling * 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet * 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s) * 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json * 12:39 kharlan@deploy1003: kharlan: Continuing with deployment * 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet * 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet * 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie * 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp * 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. * 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] * 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie * 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]]) * 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet * 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet * 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply * 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply * 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet * 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. * 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json * 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance * 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json * 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply * 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply * 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. 
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker * 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet * 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet * 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet * 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply * 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply * 12:21 moritzm: installing nginx security updates * 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet * 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance * 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage * 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet * 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance * 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance * 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance * 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance * 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance * 12:18 fceratto@cumin1003: dbctl 
commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json * 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet * 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet * 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage * 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet * 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet * 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet * 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet * 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json * 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet * 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet * 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet * 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet * 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet * 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: 
END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet * 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie * 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet * 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org * 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json * 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet * 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org * 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}} * 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json * 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance * 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet * 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet * 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json 
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet * 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet * 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet * 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet * 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . * 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet * 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org * 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet * 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet * 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet * 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet * 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
testreduce1002.eqiad.wmnet * 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet * 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet * 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet * 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet * 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet * 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker * 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet * 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet * 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet * 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet * 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet * 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet * 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet * 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json * 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet * 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet * 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet * 
11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet * 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet * 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet * 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet * 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet * 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet * 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json * 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet * 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad * 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet * 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet * 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json * 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 
es2037.codfw.wmnet with reason: Maintenance * 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json * 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet * 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet * 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad * 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet * 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet * 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw * 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet * 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage * 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json * 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet * 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet * 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet * 
10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage * 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw * 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet * 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json * 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet * 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet * 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s) * 10:43 dpogorzelski@deploy1003: helmfile 
[ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet * 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:39 jiji@deploy1003: jiji: Continuing with deployment * 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json * 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] * 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet * 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . 
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet * 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices * 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' . * 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json * 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance * 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet * 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json * 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet * 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet * 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet * 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet * 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to 
/var/cache/conftool/dbconfig/20260521-101552-fceratto.json * 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' . * 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet * 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet * 10:12 moritzm: installing postgresql security updates * 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet * 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet * 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org * 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . 
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet * 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet * 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet * 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet * 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json * 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet * 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet * 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet * 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet * 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet * 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org * 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet * 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org * 09:59 jayme@cumin1003: START - Cookbook 
sre.hosts.reboot-single for host registry1005.eqiad.wmnet * 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw * 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet * 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet * 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet * 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet * 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet * 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json * 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet * 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org * 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START 
helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply * 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet * 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet * 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply * 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet * 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet * 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply * 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet * 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet * 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet * 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json * 09:48 jmm@cumin2002: END (PASS) - Cookbook 
sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet * 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json * 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet * 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]] * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad * 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet * 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet * 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet * 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART * 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet * 09:42 
jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet * 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet * 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet * 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet * 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet * 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet * 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet * 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet * 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet * 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet * 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json * 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet * 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet * 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet * 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet * 09:36 
jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet * 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet * 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet * 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet * 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet * 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet * 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet * 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet * 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet * 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet * 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet * 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet * 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet * 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet * 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet * 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad * 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw * 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet * 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet * 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json * 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet * 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet * 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet * 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet * 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet * 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw * 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A * 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet * 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet * 
09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet * 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad * 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet * 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet * 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet * 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet * 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw * 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A * 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie * 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]] * 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json * 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet * 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet * 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet * 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single 
(exit_code=0) for host wikikube-worker1376.eqiad.wmnet * 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling * 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet * 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet * 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json * 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance * 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet * 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet * 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet * 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed * 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet * 08:37 fceratto@cumin1003: START - Cookbook 
sre.mysql.pool pool es1036: Repooling * 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json * 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json * 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance * 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed * 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet * 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie * 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:51 marostegui@dns1004: END - running authdns-update * 07:50 marostegui@dns1004: START - running authdns-update * 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]] * 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet * 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet * 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of 
kubestagemaster1005.eqiad.wmnet to plain * 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain * 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage * 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd * 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage * 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet * 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd * 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie * 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet * 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet * 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain * 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain * 07:17 marostegui@cumin1003: DONE 
(PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting * 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd * 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd * 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain * 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain * 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd * 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org * 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org * 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet * 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org * 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org * 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet * 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd * 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003 * 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003 * 06:15 marostegui@dns1004: END - running authdns-update * 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]] * 06:13 marostegui@dns1004: START - running authdns-update * 05:39 
marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json * 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json * 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json * 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning * 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip * 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet * 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet * 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as 
other wikis (2/2) (T426880)]] (duration: 06m 37s) * 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment * 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] * 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s) * 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment * 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] * 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]]) * 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s) * 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment * 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] * 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet * 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet * 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet * 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002 * 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet * 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet * 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet * 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet * 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet * 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet * 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet * 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet * 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs * 21:52 bking@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host wdqs2019.codfw.wmnet * 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet * 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox) * 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org * 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp * 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet * 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet * 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org * 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet * 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet * 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet * 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp * 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet * 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org * 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet * 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye * 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for 
host wdqs1022.eqiad.wmnet * 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet * 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org * 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet * 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet * 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet * 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet * 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s) * 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage * 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment * 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet * 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet * 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage * 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org * 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI due to Java upgrade which causes integration/pipelinelib to not be loadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook
sre.mysql.major-upgrade (exit_code=99) * 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99) * 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment * 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet * 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet * 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet * 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] * 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet * 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet * 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet * 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet * 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet * 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) 
for draining ganeti node ganeti2036.codfw.wmnet * 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet * 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet * 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet * 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet * 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet * 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet * 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet * 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet * 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie * 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" * 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" * 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie * 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet * 11:38 jiji@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet * 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet * 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet * 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed * 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage * 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie * 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet * 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet * 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:20 jmm@cumin2002: END 
(FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A * 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage * 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055 * 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet * 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055 * 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet * 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage * 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet * 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet * 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage * 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet * 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet * 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet * 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host 
dse-k8s-worker1006.eqiad.wmnet * 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet * 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp * 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet * 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet * 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw * 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet * 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet * 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie * 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad * 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage * 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes * 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling 
reboot on A:cephosd-codfw * 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie * 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet * 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage * 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048 * 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes * 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet * 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet * 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet * 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet * 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet * 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet * 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet * 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . 
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet * 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet * 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet * 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet * 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage * 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie * 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet * 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet * 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet * 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage * 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet * 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet * 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet * 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet * 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host 
ml-serve2008.codfw.wmnet * 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet * 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet * 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed * 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet * 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet * 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet * 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet * 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet * 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie * 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet * 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet * 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie * 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet * 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet * 10:23 slyngshede@dns1004: END - running authdns-update * 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet * 10:23 btullis@cumin1003: 
END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet * 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet * 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw * 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet * 10:21 slyngshede@dns1004: START - running authdns-update * 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet * 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet * 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet * 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage * 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet * 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet * 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance * 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet * 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet * 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet * 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet * 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet * 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet * 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet * 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet * 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet * 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet * 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet * 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot * 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet * 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet * 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet * 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet * 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie * 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet * 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: 
Rebooting clouddb1015 [[phab:T415165|T415165]] * 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet * 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet * 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet * 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet * 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet * 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet * 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet * 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet * 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet * 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet * 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed * 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply * 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply * 09:43 cwilliams@cumin1003: END (PASS) - 
Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie * 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet * 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet * 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet * 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet * 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet * 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet * 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet * 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet * 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet * 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet * 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet * 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet * 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet * 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet * 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet * 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet * 09:28 btullis@cumin1003: START - Cookbook 
sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes * 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet * 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org * 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet * 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet * 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet * 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet * 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad * 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet * 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage * 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet * 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org * 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage * 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org * 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet * 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]] * 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node 
(exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet * 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet * 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed * 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org * 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org * 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389 * 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389 * 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947 * 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet * 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet * 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet * 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947 * 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie * 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet * 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet * 09:01 cwilliams@cumin1003: 
START - Cookbook sre.mysql.major-upgrade * 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org * 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet * 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org * 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet * 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw * 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org * 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet * 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply * 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply * 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp * 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed * 08:26 moritzm: installing Java 11 security updates * 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie * 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org * 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm 
(forward port of latest Java 8 security release) * 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot * 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot * 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org * 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot * 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot * 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet * 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org * 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage * 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage * 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org * 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot * 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet * 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet * 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet * 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot * 07:52 mvernon@cumin2002: 
START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet * 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet * 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet * 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet * 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot * 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet * 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet * 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet * 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet * 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie * 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet * 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet * 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet * 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet * 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet * 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org * 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet * 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org * 07:32 mvernon@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host ms-be2093.codfw.wmnet * 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet * 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet * 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet * 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet * 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org * 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot * 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet * 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet * 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet * 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org * 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet * 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet * 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet * 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet * 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s) * 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet * 07:23 jmm@cumin2002: END (PASS) - 
Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet * 07:19 mlitn@deploy1003: mlitn: Continuing with deployment * 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet * 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet * 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] * 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet * 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s) * 07:11 mlitn@deploy1003: mlitn: Continuing with deployment * 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet * 07:09 moritzm: remove haveged * 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile
[staging] DONE helmfile.d/services/miscweb: apply * 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.* * 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet * 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie * 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003" * 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.* * 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet * 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.* * 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003" * 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s) * 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage * 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie

== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on
P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037 * 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037 * 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet * 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037 * 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037 * 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:16 jclark@cumin1003: END 
(FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036 * 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036 * 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet * 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder * 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003" * 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by 
cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003" * 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox * 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s) * 20:51 sbassett@deploy1003: sbassett: Continuing with deployment * 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] * 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s) * 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment * 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp * 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet * 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp * 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet * 20:34 ebernhardson@deploy1003: ebernhardson: Backport for 
[[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] * 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir * 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet * 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet * 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s) * 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment * 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet * 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet * 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:10 tgr@deploy1003: cscott, tgr: Backport for 
[[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] * 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet * 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet * 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet * 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet * 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet * 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s) * 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet * 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet * 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet * 19:53 otto@deploy1003: 
otto: Continuing with deployment * 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet * 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet * 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet * 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet * 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet * 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet * 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on
P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet * 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet * 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test analytics/refinery@eeef7f3d] (duration: 02m 00s) * 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test analytics/refinery@eeef7f3d] * 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org * 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org * 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]] * 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet * 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:21 jiji@cumin1003: 
START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet * 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp * 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet * 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir * 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir * 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir * 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet * 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet * 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir * 18:52 jiji@cumin1003: END 
(PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw * 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet * 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir * 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet * 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet * 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet * 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host 
wikikube-worker[1123-1126].eqiad.wmnet * 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet * 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp * 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]] * 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet * 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet * 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet * 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet * 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet * 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet * 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet * 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:51 joal@deploy1003: Started deploy 
[analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d] * 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet * 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart * 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet * 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s) * 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] * 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet * 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [codfw] DONE 
helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet * 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet * 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet * 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet * 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet * 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet * 17:02 jiji@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet * 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet * 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart * 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet * 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart * 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet * 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet * 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet * 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:32 jiji@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart * 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet * 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet * 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet * 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet * 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie * 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet * 15:59 jiji@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet * 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet * 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet * 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]] * 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet * 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet * 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet * 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed * 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir * 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet * 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet * 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet * 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 
15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet * 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet * 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet * 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet * 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet * 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet * 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy * 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet * 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet * 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet * 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet * 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet * 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
ms-be1088.eqiad.wmnet * 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet * 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet * 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet * 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet * 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet * 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet * 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet * 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw * 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet * 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet * 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet * 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet * 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet * 15:25 
jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet * 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet * 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet * 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet * 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet * 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet * 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet * 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet * 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83 * 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet * 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet * 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet * 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet * 15:19 jiji@cumin1003: END (PASS) - 
Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet * 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet * 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet * 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet * 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s) * 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet * 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy * 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] * 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet * 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet * 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s) * 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet * 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart * 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy * 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] * 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet * 15:13 jmm@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet * 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet * 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet * 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet * 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet * 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad * 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet * 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet * 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet * 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet * 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet * 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet * 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet * 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet * 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s) * 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet * 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet 
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet * 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet * 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad * 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet * 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet * 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] * 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw * 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet * 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet * 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet * 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet * 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet * 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host 
ms-be1082.eqiad.wmnet * 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s) * 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet * 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet * 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet * 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet * 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment * 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet * 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet * 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet * 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet * 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw * 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet * 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet * 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw * 14:44 
elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet * 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet * 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough * 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage * 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum * 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
C * 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie * 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] * 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s) * 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment * 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy * 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet * 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet * 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet * 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet * 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
aux-k8s-worker2009.codfw.wmnet * 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet * 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet * 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica * 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). C * 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet * 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet * 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage * 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json * 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir * 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json * 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet * 14:34 sukhe@cumin1003: START - Cookbook 
sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy * 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet * 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] * 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet * 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet * 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet * 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet * 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet * 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet * 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet * 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet * 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet * 14:27 elukey@cumin1003: END 
(PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet * 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet * 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet * 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet * 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet * 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet * 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet * 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet * 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet * 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie * 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie * 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet * 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet * 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet * 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet * 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade * 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet * 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host 
aux-k8s-worker2005.codfw.wmnet * 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet * 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet * 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet * 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet * 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet * 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet * 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet * 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet * 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet * 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet * 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet * 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet * 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox) * 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org * 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet * 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet * 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host 
aux-k8s-worker2003.codfw.wmnet * 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet * 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet * 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet * 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet * 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet * 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet * 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet * 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet * 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage * 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet * 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet * 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet * 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org * 13:57 elukey@cumin1003: START 
- Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet * 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw * 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet * 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet * 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet * 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet * 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet * 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet * 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet * 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet * 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet * 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage * 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet * 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet * 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org * 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet * 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet * 13:39 
jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet * 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet * 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet * 13:37 Lucas_WMDE: UTC afternoon backport+config window done * 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage * 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s) * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003" * 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003" * 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet * 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet * 13:32 jmm@cumin2002: END 
(PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet * 13:32 cscott@deploy1003: cscott: Continuing with deployment * 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox * 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet * 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet * 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet * 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet * 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet * 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet * 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org * 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox) * 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy * 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum * 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough * 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet * 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet * 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet * 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of 
ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet * 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] * 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet * 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet * 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet * 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie * 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet * 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet * 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet * 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet * 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. 
(T426010)]] (duration: 07m 00s) * 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet * 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet * 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet * 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet * 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet * 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]] * 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet * 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]] * 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie * 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet * 13:12 dbrant@deploy1003: dbrant: Continuing with deployment * 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005 * 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet * 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add 
"get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet * 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie * 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet * 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet * 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] * 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie * 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet * 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet * 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet * 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage * 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet * 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage * 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet * 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet * 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti 
node ganeti2027.codfw.wmnet * 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet * 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet * 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet * 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet * 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet * 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet * 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie * 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet * 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie * 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet * 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie * 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet * 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet * 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet * 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet * 12:32 
jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet * 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet * 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie * 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie * 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet * 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet * 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet * 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet * 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet * 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet * 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet * 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie * 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage * 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie * 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet * 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie * 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie * 12:19 
jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet * 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet * 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet * 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet * 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet * 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie * 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage * 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet * 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet * 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet * 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage * 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet * 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie * 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet * 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet * 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single 
(exit_code=0) for host prometheus6002.drmrs.wmnet * 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie * 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. * 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet * 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. * 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage * 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet * 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet * 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet * 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie * 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet * 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet * 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie * 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage * 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet * 12:00 mvernon@cumin2002: END (PASS) - Cookbook 
sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet * 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage * 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet * 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet * 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet * 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage * 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad * 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet * 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet * 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet * 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet * 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage * 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart * 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage * 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage * 11:48 marostegui@cumin1003: START - Cookbook 
sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage * 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage * 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage * 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage * 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet * 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet * 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage * 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage * 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage * 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet * 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet * 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet * 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart * 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004 * 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001 * 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage 
for host db1288.eqiad.wmnet with OS trixie * 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie * 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie * 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie * 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet * 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie * 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie * 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet * 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet * 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet * 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet * 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie * 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie * 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie * 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie * 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie * 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet * 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet * 11:24 jmm@cumin2002: START - Cookbook 
sre.hosts.reboot-single for host ganeti7003.magru.wmnet * 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet * 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie * 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie * 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet * 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet * 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet * 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet * 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie * 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie * 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet * 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet * 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 11:10 fceratto@cumin1003: END 
(PASS) - Cookbook sre.mysql.update-replication (exit_code=0) * 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication * 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet * 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet * 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie * 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet * 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet * 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad * 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet * 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet * 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet * 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage * 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage 
(exit_code=0) for host db1274.eqiad.wmnet with OS trixie * 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet * 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage * 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet * 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet * 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie * 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage * 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie * 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet * 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet * 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet * 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet * 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie * 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet * 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet * 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on 
D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad) * 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet * 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet * 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet * 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie * 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage * 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet * 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet * 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet * 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet * 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw * 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet * 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet * 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet * 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with 
reason: host reimage * 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet * 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet * 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet * 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet * 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet * 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1 * 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet * 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage * 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage * 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage * 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet * 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet * 10:36 elukey@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet * 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage * 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage * 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet * 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet * 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet * 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet * 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad) * 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet * 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie * 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie * 10:23 marostegui@cumin1003: START - 
Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie * 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie * 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie * 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie * 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie * 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie * 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie * 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie * 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet * 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie * 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet * 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed * 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet * 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet * 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet * 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet * 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet * 09:58 tappof@cumin1003: 
START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet * 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw * 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet * 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet * 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet * 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet * 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . * 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet * 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet * 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet * 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401 * 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401 * 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet * 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet * 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet * 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad * 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet * 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool 
for host aux-k8s-worker1009.eqiad.wmnet * 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet * 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet * 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet * 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet * 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet * 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet * 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet * 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet * 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet * 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed * 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet * 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie * 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet * 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet * 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet * 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet * 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node 
pool for host aux-k8s-worker1007.eqiad.wmnet * 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet * 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet * 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet * 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet * 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet * 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet * 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet * 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet * 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet * 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet * 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet * 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw * 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage * 09:02 
jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host
ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one
- cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook
sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet * 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet * 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm * 07:33 XioNoX: add gnmic 0.46.0 to reprepro * 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s) * 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover * 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 07:14 mlitn@deploy1003: mlitn: Continuing with deployment * 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover * 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet * 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet * 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] * 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet * 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm * 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet * 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet * 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm * 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance * 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet * 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json * 06:54 moritzm: installing qemu security updates * 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json * 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1 * 06:51 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1 * 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]] * 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1 * 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet * 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json * 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]] * 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet * 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet * 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:33 
marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm * 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover * 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm * 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet * 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json * 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance * 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json * 06:19 fceratto@dns1005: END - running authdns-update * 06:18 fceratto@dns1005: START - running authdns-update * 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json * 06:14 
fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json * 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]] * 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json * 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]] * 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4 * 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s) * 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s) * 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] * 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 00:58 ladsgroup@deploy1003: Finished scap sync-world: 
Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s) * 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] * 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org * 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s) * 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment * 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org * 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s) * 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment * 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] * 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet * 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet * 21:16 mutante: gerrit-replica.wikimedia.org back online * 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]] * 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]] * 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends * 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]] * 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet * 20:30 swfrench@cumin1003: START - Cookbook 
sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet * 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet * 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet * 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet * 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet * 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet * 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet * 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet * 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet * 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet * 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet * 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet * 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet * 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet * 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet * 19:20 swfrench@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet * 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet * 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet * 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet * 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:48 jhathaway@dns1004: END - running authdns-update * 18:46 jhathaway@dns1004: START - running authdns-update * 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet * 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]] * 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet * 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet * 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet * 18:31 dzahn@cumin2002: END 
(FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet * 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet * 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet * 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet * 18:26 herron: rebooting alert1002 * 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet * 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet * 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet * 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet * 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet * 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet * 18:16 mutante: releases.wikimedia.org - rebooting backends * 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet * 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet * 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet * 
18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet * 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet * 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet * 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet * 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet * 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]] * 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet * 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet * 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad * 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet * 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet * 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm * 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:46 herron: rebooting alert2002 * 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for 
host alert2002.wikimedia.org * 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org * 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet * 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org * 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org * 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet * 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm * 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet * 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet * 17:37 mutante: stewards* - rebooting * 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet * 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet * 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet * 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet * 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet * 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
mc1062.eqiad.wmnet with reason: host reimage * 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet * 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet * 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet * 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet * 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet * 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet * 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet * 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]] * 17:14 mutante: doc.wikimedia.org - rebooting backends * 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet * 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet * 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams * 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet * 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm * 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm * 17:11 mutante: etherpad - rebooting backends * 17:10 
dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]] * 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet * 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet * 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet * 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet * 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet * 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad * 17:04 mutante: contint2002, phab2002 - rebooting * 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad) * 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet * 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet * 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet * 16:37 herron@cumin1003: 
END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fixup data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet * 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet * 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] * 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet * 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet * 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet * 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json * 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet * 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet * 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet * 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet * 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet * 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org * 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to 
/var/cache/conftool/dbconfig/20260518-124030-fceratto.json * 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org * 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet * 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet * 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet * 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet * 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet * 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json * 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet * 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet * 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet * 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm * 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet * 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet * 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet * 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet 
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json * 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm * 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet * 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm * 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet * 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet * 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet * 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet * 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm * 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet * 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet * 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host 
wikikube-worker[2059-2061].codfw.wmnet * 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet * 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-[[phab:T423840|T423840]].patch` * 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet * 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet * 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet * 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet * 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet * 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json 
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]] * 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet * 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json * 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]] * 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet * 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet * 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm * 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm * 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for 
host mc1072.eqiad.wmnet with OS bookworm * 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet * 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye * 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye * 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org * 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet * 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet * 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet * 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye * 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org * 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet * 11:21 slyngshede@dns1004: END - running authdns-update * 11:19 slyngshede@dns1004: START - running authdns-update * 11:19 blake@cumin1003: END (PASS) - Cookbook 
sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet * 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet * 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org * 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet * 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet * 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet * 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org * 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org * 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org * 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet * 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org * 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org * 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet * 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet * 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage * 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage * 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org * 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org * 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org * 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet * 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet * 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet * 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org * 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet * 10:56 slyngshede@dns1004: END - running authdns-update * 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye * 10:54 slyngshede@dns1004: START - running authdns-update * 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye * 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye * 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node 
(exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet * 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet * 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe * 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org * 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org * 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org * 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet * 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org * 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org * 10:37 slyngshede@cumin1003: START - Cookbook 
sre.hosts.reboot-single for host idp-test2005.wikimedia.org * 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet * 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet * 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet * 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet * 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json * 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw * 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw * 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet * 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to 
https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json * 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet * 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet * 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json * 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet * 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json * 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance * 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet * 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet * 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet * 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to 
/var/cache/conftool/dbconfig/20260518-100710-fceratto.json * 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json * 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet * 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]] * 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye * 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet * 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet * 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet * 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json * 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]] * 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet * 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet * 09:47 blake@cumin1003: END (PASS) - 
Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet * 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet * 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye * 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage * 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet * 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet * 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet * 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw) * 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage * 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet * 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]] * 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086 * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086 * 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086 * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 
18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002" * 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage * 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002" * 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086 * 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye * 09:18 moritzm: installing Java 21 security updates * 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]] * 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts * 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s) * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 09:08 cwilliams@cumin1003: START - Cookbook 
sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003" * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082 * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082 * 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082 * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002" * 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002" * 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s) * 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox * 09:03 ayounsi@dns1004: END - running authdns-update * 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s) * 09:01 ayounsi@dns1004: START - running authdns-update * 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet * 08:50 javiermonton@deploy1003: javiermonton: 
Continuing with deployment * 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082 * 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye * 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] * 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet * 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthboo-next: apply * 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply * 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet * 08:12 moritzm: installing glibc bugfix updates from bookworm point release * 07:46 moritzm: installing systemd bugfix updates from bookworm point release * 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet * 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet * 07:35 moritzm: installing openssl bugfix 
updates from bookworm point release * 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet * 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json * 06:59 moritzm: installing systemd bugfix updates from trixie point release * 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts * 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts * 06:49 moritzm: installing glibc bugfix updates from trixie point release * 06:44 moritzm: installing openssl bugfix updates from trixie point release * 06:33 moritzm: installing Linux 6.12.88 on trixie hosts * 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99) * 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 05:11 marostegui@cumin1003: START 
- Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot
policy FORCED * 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290 * 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290 * 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox * 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply * 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply * 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts * 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s) * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003" * 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003" * 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet * 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012 * 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet * 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 11:24 
mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye * 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage * 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage * 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye * 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie * 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065 * 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065 * 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065 * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 
167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002" * 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002" * 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065 * 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye * 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage * 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage * 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 10:20 
mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage * 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003" * 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003" * 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]] * 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye * 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]] * 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]] * 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye * 09:32 topranks: Migrate 
cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]] * 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065 * 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]] * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064 * 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064 * 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064 * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002" * 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002" * 09:03 
elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox * 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064 * 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye * 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json * 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json * 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]] * 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]] * 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json * 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065 * 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064 * 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie * 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie * 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:05 elukey@cumin1003: 
START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064 * 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie * 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010 * 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010 * 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with 
chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s) * 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm * 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm * 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage * 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage * 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision 
(exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host
db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289 * 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289 * 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:47 egardner@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 
'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json * 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json * 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync * 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync * 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply * 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply * 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply * 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply * 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye * 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye * 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to 
/var/cache/conftool/dbconfig/20260514-104521-marostegui.json * 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'. * 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'. * 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply * 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply * 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage * 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage * 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye * 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye * 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with 
OS bookworm * 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned * 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned * 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned * 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply * 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply * 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned * 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye * 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye * 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye * 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye * 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage * 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]] * 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage * 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage * 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
mc1068.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage * 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage * 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye * 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye * 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye * 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json * 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]]) * 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply * 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply * 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3 * 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply * 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]] * 06:39 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0) * 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache * 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]] * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie * 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie * 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage * 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage * 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie * 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie * 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie * 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie * 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie * 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: 
Restart for upgrade to JVM 11.0.31 - eevans@cumin1003

== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] * 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s) * 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment * 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] * 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply * 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply * 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply * 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply * 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply * 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply * 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply * 18:25 tchin@deploy1003: helmfile [codfw] START 
helmfile.d/services/eventgate-analytics-external: apply * 18:20 cmooney@dns2005: END - running authdns-update * 18:19 cmooney@dns2005: START - running authdns-update * 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply * 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply * 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003" * 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003" * 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply * 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply * 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply * 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply * 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply * 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply * 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply * 17:42 tchin@deploy1003: helmfile [staging] START 
helmfile.d/services/eventgate-analytics-external: apply * 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]] * 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure * 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul * 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]] * 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply * 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply * 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet * 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet * 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet * 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet * 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]] * 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]] * 16:10 btullis@cumin1003: END (FAIL) - Cookbook 
sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.* * 15:36 fabfur: repooling cp7009 to test 
haproxy-awslc behavior ([[phab:T419825|T419825]]) * 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.* * 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]]) * 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:16 cmooney@dns2005: END - running authdns-update * 15:15 cmooney@dns2005: START - running authdns-update * 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:00 jelto@deploy1003: 
helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003" * 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003 * 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie * 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s) * 14:37 kharlan@deploy1003: kharlan: Continuing with deployment * 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] * 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003 * 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002" * 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003 * 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002" * 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage * 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s) * 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage * 14:16 jforrester@deploy1003: helmfile [eqiad] START 
helmfile.d/services/wikifunctions: apply * 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003 * 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:15 jforrester@deploy1003: jforrester: Continuing with deployment * 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] * 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * 14:08 Lucas_WMDE: UTC afternoon backport+config window done * 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for 
[[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}} * 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment * 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.* * 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org * 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply * 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply * 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply * {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}} * 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply * 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply * 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie * 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply * 13:59 
eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003 * {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}} * 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]]) * 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org * 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s) * 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002 * 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment * 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] * {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}} * 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment * {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}} * 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002 * 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], 
[[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}} * 13:25 moritzm: installing openjdk-11 security updates * 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s) * 13:07 sbisson@deploy1003: sbisson: Continuing with deployment * 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw * 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] * 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s) * 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment * 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] * 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.* * 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]]) * 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]] * 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) * 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed * 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet * 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet * 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]] * 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003" * 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003" * 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie * 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed * 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. 
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie * 11:40 topranks: delete old direct ibgp peering between cr1-drms and cr2-drmrs [[phab:T424611|T424611]] * 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 11:27 topranks: add ibgp peering between cr1-drms and cr2-drmrs over loopback IPs [[phab:T424611|T424611]] * 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage * 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage * 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts * 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie * 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie * 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie * 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet * 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet * 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade * 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie * 10:55 jiji@deploy1003: helmfile 
[codfw] DONE helmfile.d/services/ratelimit: apply * 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org * 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts * 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage * 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org * 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage * 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage * 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]] * 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage * 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie * 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for 
hosts pki2002.codfw.wmnet * 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply * 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply * 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:10 moritzm: installing Apache security updates on Bullseye * 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie * 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply * 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye * 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply * 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie * 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie * 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie * 10:02 jmm@deploy1003: helmfile [staging] DONE 
helmfile.d/services/proton: apply * 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply * 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json * 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json * 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye * 09:56 moritzm: installing distro-info-data updates from Bookworm point release * 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]] * 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]] * 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye * 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json * 09:51 moritzm: installing ca-certificates update from Bookworm point release * 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye * 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 
mc1064.eqiad.wmnet with reason: host reimage * 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s) * 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 09:41 kharlan@deploy1003: kharlan: Continuing with deployment * 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] * 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage * 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage * 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage * 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage * 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage * 09:28 cmooney@dns2005: END - running authdns-update * 09:27 cmooney@dns2005: START - running authdns-update * 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]] * 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet * 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for 
host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage * 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]] * 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye * 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye * 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye * 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw * 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003" * 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003" * 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:45 moritzm: installing dnsmasq security updates * 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply * 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - 
cmooney@cumin1003" * 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply * 08:38 cmooney@dns2005: END - running authdns-update * 08:37 cmooney@dns2005: START - running authdns-update * 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply * 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003" * 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply * 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply * 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply * 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s) * 08:20 kharlan@deploy1003: kharlan: Continuing with deployment * 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] * 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build) * 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]] * 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build) * 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s) * 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment * 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync * 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync * 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] * 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync * 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync * 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s) * 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment * 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] * 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie * 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie * 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay * 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie * 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie * 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie * 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie * 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage * 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage * 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage * 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage * 05:37 marostegui@cumin1003: START - Cookbook 
sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie * 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie * 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie * 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie * 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie * 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie * 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie * 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie * 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm * 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage * 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm * 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:42 
vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm * 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm * 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage * 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278 * 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278 * 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update 
mgmt [db1278] - vriley@cumin1003" * 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003" * 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage * 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox * 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm * 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm * 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277 * 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277 * 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:25 vriley@cumin1003: END (PASS) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003" * 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003" * 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm * 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276 * 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage * 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276 * 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003" * 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003" * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s) * 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm * 01:37 
vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s) * 01:28 zabe@deploy1003: zabe: Continuing with deployment * 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm * 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] * 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275 * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275 * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003" * 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003" * 01:08 
vriley@cumin1003: START - Cookbook sre.dns.netbox * 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274 * 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274 * 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003" * 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003" * 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox * 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm * 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage * 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage * 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language 
fallback" (T425988)]] (duration: 12m 45s) * 23:40 cscott@deploy1003: cscott: Continuing with deployment * 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] * 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s) * 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm * 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment * 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm * 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] * 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage * 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage * 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s) * 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment * 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:59 dwisehaupt@dns1004: END - running authdns-update * 21:57 dwisehaupt@dns1004: START - running authdns-update * 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm * 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] * 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273 * 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm * 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273 * 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s) * 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003" * 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003" * 
21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment * 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage * 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] * 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage * 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s) * 21:15 
cscott@deploy1003: cscott: Continuing with deployment * 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]] * 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm * 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm * 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor 
release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] * 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]] * 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm * 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s) * 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage * 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003" * 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment * 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage * 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage * 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage * 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] * 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s) * 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm * 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:20 dbrant@deploy1003: dbrant: Continuing with deployment * 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. 
(T426010)]] * 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s) * 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment * 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm * 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]] * 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] * 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]] * 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye * 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]] * 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage * 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s) * 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage * 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment * 19:30 dancy@deploy1003: jforrester, dancy: Backport for 
[[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] * 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]] * 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]] * 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 18:08 brett@cumin2002: END (ERROR) - Cookbook 
sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye * 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s) * 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:52 otto@deploy1003: otto: Continuing with deployment * 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 
17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] * 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 17:37 
jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply * 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply * 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm * 16:25 moritzm: installing Exim security updates on lists/vrts hosts * 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s) * 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment * 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] * 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]] * 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm * 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts * 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s) * 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts * 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s) * 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s) * 15:12 eevans@deploy1003: helmfile [staging] DONE 
helmfile.d/services/linked-artifacts: apply * 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply * 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm * 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance * 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye * 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye * 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm * 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply * 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye * 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye * 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001 * 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host 
dse-k8s-wdqs-test2001 * 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001 * 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors * 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors * 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003" * 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm * 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003" * 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001 * 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage * 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001 * 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001 * 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors * 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox * 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors * 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 
14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003" * 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001 * 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003" * 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox * 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001 * 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage * 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage * 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage * 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage * 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply * 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply * 14:15 Lucas_WMDE: UTC afternoon backport+config window done * 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 
02s) * 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment * 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271 * 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] * 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply * 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply * 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye * 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye * 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye * 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye * 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. 
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272 * 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272 * 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003" * 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003" * 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox * 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox * 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox * 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm * 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync * 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync * 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye * 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync * 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync * 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync * 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: 
sync * 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]] * 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] * 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot * 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot * 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s) * 13:09 sbisson@deploy1003: sbisson: Continuing with deployment * 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] * 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply * 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply * 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply * 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply * {{safesubst:SAL entry|1=12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42}} * 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment * 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced * {{safesubst:SAL entry|1=12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425}} * 12:10 kharlan@deploy1003: Finished scap sync-world: 
Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s) * 12:06 kharlan@deploy1003: kharlan: Continuing with deployment * 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] * 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003" * 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003" * 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s) * 09:51 kharlan@deploy1003: kharlan: Continuing with deployment * 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] * 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json * 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json * 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json * 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json * 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json * 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance * 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie * 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie * 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply * 08:07 
cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply * 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s) * 08:00 dcausse@deploy1003: dcausse: Rolling back deployment * 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] * 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie * 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie * 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie * 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie * 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage * 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage * 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage * 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage * 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet * 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:47 marostegui@cumin1003: 
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s) * 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie * 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie * 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie * 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie * 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie * 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie * 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie * 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet * 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]] * 06:27 jayme@dns1004: END - running authdns-update * 06:26 jayme@dns1004: START - running authdns-update * 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs 
[[phab:T423911|T423911]] (duration: 36m 36s) * 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply * 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply * 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s) * 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment * 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]

== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] * 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage * 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s) * 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm * 19:58 zabe@deploy1003: zabe: Continuing with deployment * 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] * 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye * 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts * 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper * 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 19:24 vriley@cumin1003: END (PASS) - Cookbook 
sre.network.configure-switch-interfaces (exit_code=0) for host db1269 * 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269 * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003" * 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003" * 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox * 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm * 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:16 dzahn@dns1005: END - running authdns-update * 19:14 dzahn@dns1005: START - running authdns-update * 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva`. 
to free up disk space * 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage * 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye * 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync * 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync * 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync * 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie * 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]] * 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm * 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm * 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm * 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage 
(exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye * 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json * 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268 * 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268 * 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003" * 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003" * 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json * 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox * 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json * 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye * 17:12 dancy@deploy1003: 
Installation of scap version "4.263.0" completed for 2 hosts * 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s) * 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json * 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie * 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json * 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance * 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply * 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply * 16:39 
jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'. * 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'. * 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'. * 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'. * 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. * 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'. * 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. * 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s) * 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply * 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply * 16:23 zabe@deploy1003: zabe: Continuing with deployment * 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] * 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply * 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply * 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s) * 15:54 zabe@deploy1003: zabe: Continuing with deployment * 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] * 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s) * 15:42 zabe@deploy1003: zabe: Continuing with deployment * 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] * 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm * 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement * 14:54 cdanis@deploy1003: 
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017 * 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017 * 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:39 Lucas_WMDE: UTC afternoon backport+config window done * 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18 * 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox * 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment * {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], 
[[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}} * 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] * {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}} * 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm * 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL * 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment * {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username 
previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}} * 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad * 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs * 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm * 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs * 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad * 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw * 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs * 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs * 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw * 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username 
previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}} * 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]] * 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s) * 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm * 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie * 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment * 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] * 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'. * 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'. * 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'. * 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'. * 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. * 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'. * 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. * 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s) * 13:06 elukey: remove old discovery pki intermediate * 13:03 otto@deploy1003: otto: Continuing with deployment * 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] * 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. * 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'. 
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s) * 12:47 kharlan@deploy1003: kharlan: Continuing with deployment * 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] * 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage * 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie * 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]]) * 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot * 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts * 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply * 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply * 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s) * 11:21 jayme@deploy1003: Rolling back deployment * 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] * 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance * 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance * 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]] * 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts * 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance * 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s) * 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image * 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply * 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply * 10:16 slyngs: Migrate of lvs2012 due to hardware issues * 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s) * 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]] * 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:04 fceratto@cumin1003: DONE (PASS) - 
Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 09:59 kharlan@deploy1003: kharlan: Continuing with deployment * 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure * 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure * 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]] * 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:37 
jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json * 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json * 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json * 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:56 fceratto@cumin1003: START - Cookbook 
sre.mysql.pool pool db1230: [[phab:T419635|T419635]] * 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance * 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance * 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json * 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd * 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json * 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01 * 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance * 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01 * 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet * 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet * 08:10 slyngshede@dns1004: END - running authdns-update * 08:08 slyngshede@dns1004: START - running authdns-update * 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:00 ayounsi@cumin1003: END 
(PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003" * 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003" * 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply * 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. * 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'. * 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. * 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. * 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply * 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply * 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply * 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors * 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors * 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm * 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage * 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage * 
06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet 
with OS bookworm * 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage * 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm * 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED == 2026-05-08 == * 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267 * 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267 * 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003" * 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003" * 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox * 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm * 
23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage * 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm * 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266 * 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266 * 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003" * 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003" * 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm * 21:42 vriley@cumin1003: END (PASS) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage * 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage * 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm * 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265 * 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265 * 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" * 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003" * 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox * 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. 
Unless we find an obvious smoking gun, expect noise to continue for the time being :/ * 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health * 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps * 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad * 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart * 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart * 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet * 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet * 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]] * 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad * 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart * 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]] * 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad * 09:36 jelto@deploy1003: helmfile 
[aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json * 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json * 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json * 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json * 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json * 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance * 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json * 08:40 fceratto@cumin1003: dbctl commit 
(dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json * 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json * 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json * 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json * 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance * 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by 
cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox * 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie * 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org * 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie * 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie * 06:11 moritzm: installing postorius security updates * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage * 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage * 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie * 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie * 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie * 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie * 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie * 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) 
for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage * 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage * 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie * 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024 * 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024 * 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003" * 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003" * 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox * 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie * 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with 
reason: host reimage * 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage * 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie * 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023 * 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023 * 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003" * 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003" * 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox == 2026-05-07 == * 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie * 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage * 
23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage * 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie * 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s) * 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] * 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s) * 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] * 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s) * 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] * {{safesubst:SAL entry|1=21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)}} * 21:23 cscott@deploy1003: cscott: Continuing with deployment * 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 
T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t * {{safesubst:SAL entry|1=21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]}} * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie * 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s) * 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003" * 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment * 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be v * 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] * 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki * 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki * 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage * 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage * 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s) * 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment * 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie * 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] * 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie * 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie * 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022 * 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022 * 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003" * 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003" * 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox * 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply * 18:26 ebysans@deploy1003: helmfile 
[codfw] START helmfile.d/services/editor-analytics: apply * 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply * 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply * 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply * 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply * 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply * 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply * 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply * 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply * 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply * 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply * 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:06 cdanis@dns1005: END - running authdns-update * 18:04 cdanis@dns1005: START - running authdns-update * 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s) * 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis * 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply * 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply * 17:51 krinkle@deploy1003: krinkle: Continuing with deployment * 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be verified there. * 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply * 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply * 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] * 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply * 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply * 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart * 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart * 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 16:32 jynus: restarting backup1-* database primary hosts * 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart * 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart * 16:14 sukhe@dns1004: END - running authdns-update * 16:13 sukhe@dns1004: START - running authdns-update * 16:13 sukhe@dns1004: START - running authdns-update * 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE 
helmfile.d/services/miscweb: apply * 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox) * 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart * 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox) * 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply * 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply * 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply * 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply * 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling 
P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica * 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts * 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts * 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet * 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified] * 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply * 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply * 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs * 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad * 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply * 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply * 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply * 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply * 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply * 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply * 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s) * 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply * 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply * 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] * 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s) * 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad * 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply * 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply * 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] * 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply * 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply * 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s) * 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 14:32 slyngshede@dns1004: END - running authdns-update * 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] * 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START 
helmfile.d/services/miscweb: apply * 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply * 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply * 14:30 slyngshede@dns1004: START - running authdns-update * 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply * 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply * 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train * 14:30 jmm@dns1004: END - running authdns-update * 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply * 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply * 14:28 jmm@dns1004: START - running authdns-update * 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003" * 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003" * 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply * 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply * 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply * 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply * 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox * 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw * 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: 
apply * 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply * 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply * 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply * 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw * 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s) * 13:30 stran@deploy1003: stran: Continuing with deployment * 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] * 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s) * 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment * 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] * 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox * 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 12:45 sukhe@dns1004: FAIL - running authdns-update * 12:44 sukhe@dns1004: START - running authdns-update * 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie * 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org * 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host 
install5004.wikimedia.org with OS bookworm * 12:23 slyngshede@dns1004: FAIL - running authdns-update * 12:21 slyngshede@dns1004: START - running authdns-update * 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release * 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003" * 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003" * 12:12 slyngshede@dns1004: FAIL - running authdns-update * 12:11 slyngshede@dns1004: START - running authdns-update * 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage * 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie * 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage * 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage * 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage * 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool 
db1227: after reimage to trixie * 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie * 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage * 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie * 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org * 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org * 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie * 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage * 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie * 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage * 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json * 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie * 
11:11 moritzm: installing modsecurity-apache security updates * 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie * 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm * 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json * 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002" * 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002" * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:58
root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184 * 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184 * 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184 * 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s) * 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage * 10:54 root@cumin1003: START - Cookbook sre.dns.netbox * 10:54 fceratto@cumin1003: 
dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json * 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage * 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184 * 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie * 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage * 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] * 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json * 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage * 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage * 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:40 
daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json * 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:31 marostegui@cumin1003: 
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie * 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie * 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie * 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie * 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie * 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie * 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie * 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie * 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie * 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org * 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 10:13 moritzm: rebalance ganeti cluster in ulsfo following host
reimages [[phab:T424686|T424686]] * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie * 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox * 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org * 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage * 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage * 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie * 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd * 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host 
db2182.codfw.wmnet with OS trixie * 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie * 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie * 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance * 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd * 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd * 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet * 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet * 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie * 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage * 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org * 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage * 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage * 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org * 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage * 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage * 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS 
trixie * 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie * 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie * 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie * 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie * 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie * 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie * 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet * 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd * 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd * 08:28 
marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json * 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage * 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage * 08:23 XioNoX: drmrs remove old v6 gateway IP * 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003" * 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage * 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003" * 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd * 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync * 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync * 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet * 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet * 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host 
ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01 * 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s) * 07:49 dcausse@deploy1003: dcausse: Continuing with deployment * 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd * 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] * 07:32 moritzm: installing apache2 security updates * 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd * 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet * 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet * 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd * 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host 
ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01 * 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01 * 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie * 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie * 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie * 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie * 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage * 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage * 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage * 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage * 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie * 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie * 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie * 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie * 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json * 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary 
[[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json * 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]] * 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]] * 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json * 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s) * 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s) * 01:09 zabe@deploy1003: zabe: Continuing with deployment * 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] * 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s) * 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] * 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie * 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s) * 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:10 cjming@deploy1003: cjming: Continuing with deployment * 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] * 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox * 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021 * 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021 * 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s) * 21:48 zabe@deploy1003: zabe: Continuing with deployment * 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] * 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie * 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host 
pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003" * 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003" * 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox * 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021 * 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021 * 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s) * 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment * 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] * 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie * 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie * 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie * 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm * 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage * 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: 
apply * 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie * 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply * 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply * 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply * 18:37 dzahn@dns1005: END - running authdns-update * 18:35 dzahn@dns1005: START - running authdns-update * 18:33 brennen: 
1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1 * 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie * 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm * 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo * 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo * 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002 * 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo * 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo * 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply * 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply * 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply * 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply * 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply * 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply * 17:32 swfrench@deploy1003: helmfile [eqiad] DONE 
helmfile.d/services/shellbox: apply * 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply * 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]] * 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo * 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply * 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply * 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply * 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply * 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply * 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply * 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply * 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply * 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply * 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE 
helmfile.d/services/shellbox-syntaxhighlight: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply * 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply * 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply * 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work * 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo * 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]] * 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie * 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . 
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm * 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm * 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage * 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage * 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm * 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm * 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution. 
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'" * 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution. * 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s) * 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] * 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet * 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie * 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART 
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox * 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change] * 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change] * 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s) * 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage * 14:26 kharlan@deploy1003: kharlan: Continuing with deployment * 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'" * 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage * 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply * 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply * 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply * 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie * 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002 * 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie * 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] * 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage * 13:45 jgreen@dns1004: END - running authdns-update * 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s) * 13:44 jgreen@dns1004: START - running authdns-update * 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage * 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm * 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors * 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors * 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment * 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage * 13:31 
alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet'] * 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors * 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host 
cp4046.ulsfo.wmnet with OS trixie * 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors * 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003" * 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003" * 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox * 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] * 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie * 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage * 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie * 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie * 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie * 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START 
helmfile.d/services/miscweb: apply * 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie * 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage * 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update * 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage * 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage * 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage * 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie * 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie * 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie * 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie * 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s) * 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with 
deployment * 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie * 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] * 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage * 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet * 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage * 11:50 moritzm: installing openjdk-17 security updates * 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json * 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet * 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie * 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot * 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie * 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on 
rdb2011.codfw.wmnet with reason: host reimage * 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie * 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage * 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm * 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002" * 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage * 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json * 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie * 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json * 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002" * 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie * 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie * 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json * 11:14 
slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie * 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie * 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot * 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage * 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage * 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage * 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage * 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage * 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage * 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm * 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet'] * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous 
config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json * 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json * 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie * 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie * 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json * 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json * 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json * 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage * 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage * 
09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003" * 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED * 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json * 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance * 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003" * 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json * 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS 
trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] * 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet * 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie * 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie * 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet * 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet * 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" * 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:20 jmm@cumin2002: START - Cookbook 
sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet * 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]] * 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage * 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage * 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage * 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage * 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie * 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie * 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie * 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie * 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie * 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie * 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie * 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie * 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie * 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie * 05:19 
marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie * 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie * 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie * 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie * 05:11 marostegui@dns1004: END - running authdns-update * 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json * 05:09 marostegui@dns1004: START - running authdns-update * 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json * 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json * 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]] * 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]] * 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json * 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie * 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s) * 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]

== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] * 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s) * 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment * 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. 
(T417513)]] * 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts * 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s) * 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s) * 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment * 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). 
Changes can now be ve * 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] * 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie * 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s) * 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie * 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment * 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage * 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002 * 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002 * 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002 * 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003" * 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: 
Update records for host kafka-logging1002 - herron@cumin1003" * 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox * 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002 * 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie * 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts * 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s) * 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation * 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]" * 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]" * 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie * 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s) * 18:56 ladsgroup@deploy1003: ladsgroup: 
Continuing with deployment * 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] * 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s) * 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie * 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002" * 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox * 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with 
OS trixie * 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw * 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw * 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie * 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] * 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad * 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad * 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0 * 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie * 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage * 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage * 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. * 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'. * 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'. 
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003 * 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003 * 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003 * 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003" * 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003" * 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox * 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003 * 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie * 17:05 sukhe: sudo 
cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'" * 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s) * 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment * 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb * 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] * 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]] * 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync * 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync * 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync * 16:18 elukey@deploy1003: 
helmfile [codfw] START helmfile.d/services/wikifunctions: sync * 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s) * 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] * 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s) * 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync * 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync * 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync * 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync * 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] * 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s) * 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it dont exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] * 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json * 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s) * 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . * 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] * 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s) * 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json * 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment * 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see 
https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json * 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] * 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json * 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie * 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json * 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json * 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance * 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to 
https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json * 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json * 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad * 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json * 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage * 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet * 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage * 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet * 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json * 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json * 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet * 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004 * 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host 
kafka-logging1004 * 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie * 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet * 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json * 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet * 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox * 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad * 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw * 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync * 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: 
sync * 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 14:03 Lucas_WMDE: UTC afternoon backport+config window done * 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet * 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution. * 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet * 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet * 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json * 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance * 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after 
maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json * 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling * 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet * 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox * 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s) * 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet * 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json * 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw * 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet * 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] * 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json * 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance * 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling * 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox * 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution. 
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json * 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance * 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json * 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet * 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet * 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003" * 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox * 13:30 Msz2001: UTC afternoon backport window done * 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json * 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet * 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 13:23 jgiannelos@deploy1003: helmfile 
[eqiad] START helmfile.d/services/mw-parsoid: apply * 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s) * 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]] * 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json * 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance * 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json * 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment * 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug) * 13:15 mszwarc@deploy1003: Started scap 
sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] * 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet * 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s) * 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet * 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json * 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment * 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] * 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json * 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s) * 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] * 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json * 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s) * 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment * 12:42 moritzm: installing node-tar security updates * 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] * 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json * 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance * 12:36 moritzm: installing imagemagick security updates * 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance * 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json * 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply * 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply * 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply * 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply * 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply * 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json * 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply * 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply * 12:13 
fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json * 12:04 moritzm: installing postgresql-13 security updates * 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json * 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s) * 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet * 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json * 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance * 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json * 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet * 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] * 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s) * 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet * 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json * 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet * 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet * 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet * 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] * 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json * 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json * 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json * 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json * 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance * 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json * 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json * 11:10 moritzm: installing ca-certificates updates from bookworm point release * 11:09 marostegui@cumin1003: END (PASS) - Cookbook 
sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie * 11:07 moritzm: installing multipart bugfix updates from bookworm point release * 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json * 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica * 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json * 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie * 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json * 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json * 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'. * 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'. * 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'. * 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'. * 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. 
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'. * 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json * 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json * 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance * 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json * 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance * 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json * 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . 
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json * 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie * 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie * 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json * 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie * 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . 
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply * 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply * 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply * 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json * 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json * 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply * 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json * 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance * 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie * 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie * 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 09:29 
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie * 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie * 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie * 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie * 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie * 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie * 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json * 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply * 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply * 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json * 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json * 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 
db2206.codfw.wmnet with reason: Maintenance * 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json * 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance * 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json * 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json * 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json * 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie * 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json * 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie * 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json * 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 08:50 moritzm: 
installing augeas security updates * 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors * 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json * 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . * 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json * 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . * 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . * 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . 
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . * 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . * 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply * 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply * 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement * 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json * 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance * 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json * 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply * 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply * 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage * 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors * 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 08:28 
jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . * 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json * 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie * 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage * 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]] * 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie * 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox * 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json * 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]] * 08:05 ayounsi@dns1004: END - running authdns-update * 08:03 ayounsi@dns1004: START - running authdns-update * 
08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json * 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie * 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003" * 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003" * 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie * 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie * 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie * 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json * 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json * 07:55 awight: EU morning deployment was fun * 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to 
/var/cache/conftool/dbconfig/20260505-075416-fceratto.json * 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance * 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]] * 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json * 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]] * 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]] * 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]] * 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie * 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie * 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie * 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie * 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s) * 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage * 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment * 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers 
(see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] * 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage * 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage * 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie * 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox * 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage * 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie * 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie * 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie * 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie * 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 
db2222.codfw.wmnet with reason: Reimage to Trixie * 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie * 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie * 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage * 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage * 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003" * 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003 * 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003 * 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003" * 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie * 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie * 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie * 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie * 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie * 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 
(duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] * 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s) * 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment * 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] * 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s) * 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging * 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s) * 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment * 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for 
namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] * 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s) * 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment * 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] * 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s) * 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie * 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment * 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] * 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors * 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors * 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003" * 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003" * 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage * 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox * 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage * 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting * 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005 * 19:27 herron@cumin1003: START - Cookbook 
sre.hosts.move-vlan for host kafka-logging1005 * 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie * 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'. * 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'. * 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]] * 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s) * 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] * 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s) * 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] * 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s) * 18:11 dancy@deploy1003: dancy: Rolling back deployment * 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 18:09 dancy@deploy1003: Started scap sync-world: testing * 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts * 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s) * 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 
jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s) * 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] * 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json * 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json * 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s) * 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json * 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment * 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement * 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync * 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync * 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] * 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json * 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json * 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance * 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. 
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org * 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json * 15:10 papaul: ongoing switch refresh in ULSFO * 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox * 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) * 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s) * 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json * 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment * 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] * 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie * 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json * 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json * 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts * 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts * 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage * 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage * 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json * 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance * 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh * 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh * 14:25 pt1979@cumin1003: DONE (PASS) - 
Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh * 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001 * 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001 * 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001 * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003" * 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003" * 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json * 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox * 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001 * 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie * 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after 
maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json * 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]] * 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]] * 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s) * 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors * 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002" * 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 13:55 sbisson@deploy1003: sbisson: Continuing with deployment * 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see 
https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable * 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] * 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox * 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org * 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json * 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s) * 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment * 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] * 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json * 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json * 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance * 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json * 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json * 13:13 moritzm: installing jaraco.context security updates * 13:10 
jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet * 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm * 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json * 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json * 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic * 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json * 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance * 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json * 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance * 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage * 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 
2:00:00 on durum5004.eqsin.wmnet with reason: host reimage * 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json * 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json * 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json * 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json * 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance * 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json * 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json * 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:55 
jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors * 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json * 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002" * 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox * 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet * 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet * 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm * 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json * 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to 
/var/cache/conftool/dbconfig/20260504-113620-fceratto.json * 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance * 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json * 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie * 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage * 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage * 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json * 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json * 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json * 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance * 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json * 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance * 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json * 10:48 moritzm: installing bash updates from trixie point release * 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json * 10:42 moritzm: installing postgresql-17 security updates * 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie * 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie * 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm * 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json * 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors * 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors * 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 10:34 jmm@cumin2002: 
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002" * 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json * 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox * 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json * 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance * 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json * 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance * 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage * 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage * 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to 
/var/cache/conftool/dbconfig/20260504-100818-fceratto.json * 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie * 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie * 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie * 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie * 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json * 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org * 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json * 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org * 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json * 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance * 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE 
helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie * 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json * 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json * 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events * 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events * 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events * 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json * 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook 
sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance * 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json * 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . * 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie * 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet * 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: 
tools-k8s-worker1001.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet * 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet * 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json * 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie * 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts * 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json * 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet * 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json * 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance * 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance * 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet * 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage * 08:11 marostegui@cumin1003: START - Cookbook 
sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage * 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply * 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply * 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync * 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s) * 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync * 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync * 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync * 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet * 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts * 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment * 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie * 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] * 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet * 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie * 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply * 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply * 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie * 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie * 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie * 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic * 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all 
IPs except the asset tag one - marostegui@cumin1003" * 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003" * 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie * 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie * 07:38 moritzm: installing Linux 6.12.85 on trixie hosts * 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet * 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply * 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet * 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox * 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet * 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org * 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org * 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie * 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie * 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie * 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie * 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet 
with OS trixie * 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie * 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage * 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage * 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage * 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage * 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie * 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage * 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage * 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie * 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie * 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie * 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie * 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]] * 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie * 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie * 05:58 marostegui@cumin1003: START - Cookbook 
sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] * 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie * 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie * 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie * 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage * 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie * 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:09 jhancock@cumin2002: 
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie * 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie * 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie * 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie * 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage * 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - 
jhancock@cumin2002" * 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage * 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage * 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage * 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage * 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage * 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage * 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage * 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage * 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie * 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie * 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie * 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage * 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 
jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie * 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie * 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie * 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie * 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie * 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie * 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host 
wikikube-worker2367.codfw.wmnet with OS trixie * 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie * 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s) * 11:57 samtar@deploy1003: samtar: Continuing with deployment * 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] * 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply * 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie * 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:52 jhancock@cumin2002: END (FAIL) - 
Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s) * 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage * 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage * 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage * 01:37 jhancock@cumin2002: START 
- Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie * 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie * 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie * 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie * 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie * 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie * 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 
wikikube-worker2364.codfw.wmnet with reason: host reimage * 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage * 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage * 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage * 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage * 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage * 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie * 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie * 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie * 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie * 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie * 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 00:02 
jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook 
sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie * 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie * 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie * 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002" * 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage * 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage * 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: 
START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage * 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie * 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie * 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy 
FORCE_RESTART * 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:46 jhancock@cumin2002: END (PASS) - Cookbook 
sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy 
FORCE_RESTART * 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host 
wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART * 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374 * 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374 * 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373 * 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373 * 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372 * 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372 * 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371 * 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368 * 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368 * 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367 * 20:56 jhancock@cumin2002: START - Cookbook 
sre.network.configure-switch-interfaces for host wikikube-worker2367 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364 * 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364 * 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361 * 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360 * 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359 * 20:54 jhancock@cumin2002: START - Cookbook 
sre.network.configure-switch-interfaces for host wikikube-worker2359 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358 * 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358 * 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357 * 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357 * 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002" * 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002" * 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox * 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie * 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s) * 20:02 krinkle@deploy1003: krinkle: Continuing with deployment * 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage * 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see 
https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. * 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] * 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage * 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s) * 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]] * 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts * 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s) * 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'. * 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'. * 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts * 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002 * 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002 * 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002 * 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 18:40 herron@cumin1003: END (PASS) - Cookbook 
sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003" * 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003" * 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox * 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002 * 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie * 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie * 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage * 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage * 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003 * 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003 * 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003 * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera 
(exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003" * 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003" * 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox * 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003 * 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie * 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie * 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet * 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply * 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply * 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet * 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet * 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet * 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . 
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet * 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts * 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s) * 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage * 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage * 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004 * 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004 * 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004 * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003" * 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003" * 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts * 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s) * 14:57 
herron@cumin1003: START - Cookbook sre.dns.netbox * 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004 * 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie * 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply * 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply * 13:24 _Gerges: WikiMonitor setup * 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080 * 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078 * 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079 * 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079 * 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078 * 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077 * 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host 
cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART * 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0) * 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003" * 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003" * 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox * 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply * 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply * 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply * 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply * 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s) * 09:53 samtar@deploy1003: samtar: Continuing with deployment * 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] * 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s) * 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] * 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply * 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply * 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply * 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s) * 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image * 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s) * 00:13 zabe@deploy1003: zabe: Continuing with deployment * 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. 
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]

== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2nksuzfok9vb1jqe57halupq5p24tf3 Map of database maintenance 0 449160 2418624 2418610 2026-05-24T00:02:05Z Dexbot 30554 Bot: Updating the report 2418624 wikitext text/x-wiki
{{/Header}}
== Today (2026-05-24) ==
== Yesterday (2026-05-23) ==
== Last seven days ==
{| class="wikitable"
|+ eqiad
|-
! Section !! Work
|-
| es6 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| es7 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s5 || [[phab:T426087|Switchover s5 master (db1210 -&gt; db1230) (T426087)]] (fceratto)
|-
|}
{| class="wikitable"
|+ codfw
|-
! Section !! Work
|-
| es6 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| es7 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| pc1 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| pc2 ||
* [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
* [[phab:T421705|Move mariadb hosts to nftables (T421705)]] (ladsgroup)
|-
| pc4 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| s1 || [[phab:T426703|Switchover s1 master (db2212 -&gt; db2203) (T426703)]] (fceratto)
|-
| s4 ||
* [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
* [[phab:T426590|Switchover s4 master (db2179 -&gt; db2240) (T426590)]] (fceratto)
|-
| s5 || [[phab:T426600|Switchover s5 master (db2192 -&gt; db2213) (T426600)]] (fceratto)
|-
| x3 || [[phab:T426936|Switchover x3 master (db2241 -&gt; db2162) (T426936)]] (cwilliams)
|-
|}
[[Category:MariaDB]]
j4vpmt98u6n48d8t7ri2p3m4p0y5d30 Tool:Gitlab-account-approval/Log 116 453906 2418619 2418616 2026-05-23T13:09:30Z Gitlabaccountapprovalbot 37332 @gauthammohanraj was approved.
2418619 wikitext text/x-wiki <noinclude>'''Audit log of approvals''' made by [[gitlab:gitlabaccountapprovalbot|@gitlabaccountapprovalbot]]. __NOTOC__</noinclude> === 2026-05-23 === * 13:09 [[gitlab:gauthammohanraj|@gauthammohanraj]] was approved. * 04:21 [[gitlab:staraction|@staraction]] was approved. === 2026-05-22 === * 19:03 "i-horich" was rejected (pending since 2026-02-20T19:00:43.519Z). * 01:48 "50323233" was rejected (pending since 2026-02-20T01:48:05.555Z). === 2026-05-21 === * 18:51 "kartikeyg0104" was rejected (pending since 2026-02-19T18:48:39.707Z). * 16:27 [[gitlab:renovatebot|@renovatebot]] was approved. * 16:06 [[gitlab:gkm563|@gkm563]] was approved. === 2026-05-20 === * 01:21 "beedellrokejulianlockhart" was rejected (pending since 2026-02-18T01:19:13.284Z). === 2026-05-18 === * 23:18 "wladek92" was rejected (pending since 2026-02-16T23:16:22.939Z). * 16:36 [[gitlab:effeietsanders|@effeietsanders]] was approved. === 2026-05-14 === * 21:00 [[gitlab:nehemienathan|@nehemienathan]] was approved. === 2026-05-13 === * 10:51 "ssssaaaa" was rejected (pending since 2026-02-11T10:50:36.975Z). === 2026-05-12 === * 18:06 [[gitlab:psubhashish|@psubhashish]] was approved. * 08:12 "khan" was rejected (pending since 2026-02-10T08:11:48.776Z). * 04:27 "galaxysh" was rejected (pending since 2026-02-10T04:24:59.440Z). === 2026-05-11 === * 12:18 "peterxy12" was rejected (pending since 2026-02-09T12:18:01.982Z). === 2026-05-10 === * 11:09 "yalihupokn" was rejected (pending since 2026-02-08T11:06:51.336Z). * 05:12 "wobadha" was rejected (pending since 2026-02-08T05:11:00.569Z). === 2026-05-09 === * 13:45 "bwiki" was rejected (pending since 2026-02-07T13:43:38.177Z). === 2026-05-08 === * 09:24 [[gitlab:cwilliams|@cwilliams]] was approved. === 2026-05-07 === * 14:15 "rehankhan78" was rejected (pending since 2026-02-05T14:13:37.754Z). === 2026-05-06 === * 11:24 "ari" was rejected (pending since 2026-02-04T11:24:11.760Z). * 08:09 [[gitlab:neriah|@neriah]] was approved. 
* 06:27 [[gitlab:status401|@status401]] was approved. === 2026-05-03 === * 09:54 [[gitlab:anilk|@anilk]] was approved. === 2026-05-02 === * 17:54 [[gitlab:sweil|@sweil]] was approved. * 17:00 [[gitlab:aoppo|@aoppo]] was approved. === 2026-05-01 === * 21:18 [[gitlab:dawalda|@dawalda]] was approved. === 2026-04-30 === * 21:42 "merohibine" was rejected (pending since 2026-01-29T21:40:00.756Z). * 20:54 [[gitlab:tfmorris|@tfmorris]] was approved. * 17:33 [[gitlab:uyen|@uyen]] was approved. * 07:39 [[gitlab:mahveotm|@mahveotm]] was approved. * 06:36 [[gitlab:leo321|@leo321]] was approved. === 2026-04-29 === * 02:27 [[gitlab:dw31415|@dw31415]] was approved. === 2026-04-28 === * 23:09 [[gitlab:dtorsani|@dtorsani]] was approved. === 2026-04-27 === * 23:42 [[gitlab:quinlan|@quinlan]] was approved. * 05:00 [[gitlab:matthewyeager|@matthewyeager]] was approved. === 2026-04-26 === * 17:36 "kuba-hajnej" was rejected (pending since 2026-01-25T17:33:32.467Z). * 13:03 "jklamo" was rejected (pending since 2026-01-25T13:02:22.936Z). === 2026-04-25 === * 20:24 [[gitlab:maldaxura|@maldaxura]] was approved. * 14:33 [[gitlab:sirtobi|@sirtobi]] was approved. * 04:18 "ice5678" was rejected (pending since 2026-01-24T04:15:30.008Z). === 2026-04-24 === * 22:06 [[gitlab:arcstur|@arcstur]] was approved. === 2026-04-22 === * 23:06 "dtorsani" was rejected (pending since 2026-01-21T23:03:25.843Z). * 22:18 [[gitlab:egezort|@egezort]] was approved. * 16:45 "nexpectarpit" was rejected (pending since 2026-01-21T16:43:21.045Z). === 2026-04-20 === * 19:15 "fitch" was rejected (pending since 2026-01-19T19:12:35.644Z). === 2026-04-19 === * 02:54 [[gitlab:neoact|@neoact]] was approved. === 2026-04-18 === * 07:06 [[gitlab:kockaadmiralac|@kockaadmiralac]] was approved. === 2026-04-17 === * 13:42 "liselot" was rejected (pending since 2026-01-16T13:39:41.909Z). === 2026-04-15 === * 17:03 "lahari" was rejected (pending since 2026-01-14T17:02:06.275Z). 
=== 2026-04-14 === * 13:00 "surajseth520" was rejected (pending since 2026-01-13T12:59:45.906Z). * 04:51 [[gitlab:canley|@canley]] was approved. * 01:03 "bshizzle" was rejected (pending since 2026-01-13T01:00:48.120Z). === 2026-04-13 === * 15:30 [[gitlab:passimacopoulos|@passimacopoulos]] was approved. === 2026-04-11 === * 12:30 "krithash" was rejected (pending since 2026-01-10T12:27:24.731Z). === 2026-04-10 === * 15:30 "raunak1709" was rejected (pending since 2026-01-09T15:29:10.901Z). === 2026-04-07 === * 17:03 [[gitlab:supnabla|@supnabla]] was approved. === 2026-04-06 === * 20:00 [[gitlab:laerdon|@laerdon]] was approved. * 19:21 [[gitlab:ljq3|@ljq3]] was approved. === 2026-04-04 === * 11:06 "mixcc" was rejected (pending since 2026-01-03T11:03:33.922Z). === 2026-04-02 === * 05:30 [[gitlab:mbh1|@mbh1]] was approved. === 2026-04-01 === * 18:21 "yuvrajpatil17" was rejected (pending since 2025-12-31T18:20:27.991Z). * 12:12 [[gitlab:amorii0|@amorii0]] was approved. === 2026-03-31 === * 11:00 "krrishsehgal" was rejected (pending since 2025-12-30T11:00:16.384Z). === 2026-03-30 === * 15:36 [[gitlab:atsuko|@atsuko]] was approved. === 2026-03-29 === * 11:36 [[gitlab:giftcup|@giftcup]] was approved. === 2026-03-28 === * 14:51 [[gitlab:janeeva1|@janeeva1]] was approved. === 2026-03-26 === * 13:36 [[gitlab:saiphani02|@saiphani02]] was approved. * 11:48 [[gitlab:valerioboz-wmch|@valerioboz-wmch]] was approved. === 2026-03-25 === * 09:45 "quansi" was rejected (pending since 2025-12-24T09:42:13.451Z). * 02:18 [[gitlab:viztor|@viztor]] was approved. === 2026-03-24 === * 23:18 [[gitlab:maryyann|@maryyann]] was approved. * 23:01 [[gitlab:codenamenoreste|@codenamenoreste]] was approved. * 13:36 [[gitlab:marc-maillard-wmse|@marc-maillard-wmse]] was approved. * 07:39 "fred2675" was rejected (pending since 2025-12-23T07:39:11.380Z). === 2026-03-23 === * 14:51 [[gitlab:komla|@komla]] was approved. * 05:51 "lunachuck43" was rejected (pending since 2025-12-22T05:50:17.862Z). 
* 04:06 "reza110011" was rejected (pending since 2025-12-22T04:05:25.117Z). === 2026-03-20 === * 21:54 "mertgor" was rejected (pending since 2025-12-19T21:51:51.419Z). * 20:57 "autanmahmah" was rejected (pending since 2025-12-19T20:54:51.678Z). * 09:57 [[gitlab:nethahussain|@nethahussain]] was approved. * 09:27 [[gitlab:piewriter|@piewriter]] was approved. * 08:15 [[gitlab:dondersmooi|@dondersmooi]] was approved. === 2026-03-19 === * 21:03 "sayvhior" was rejected (pending since 2025-12-18T21:02:31.699Z). === 2026-03-18 === * 20:15 [[gitlab:martinmystere|@martinmystere]] was approved. === 2026-03-17 === * 02:51 "louperivois" was rejected (pending since 2025-12-16T02:50:48.197Z). === 2026-03-16 === * 12:54 "mokayaj857" was rejected (pending since 2025-12-15T12:53:39.015Z). * 06:18 "roamer15" was rejected (pending since 2025-12-15T06:16:38.042Z). === 2026-03-14 === * 11:12 "umaramuhammad" was rejected (pending since 2025-12-13T11:10:44.004Z). * 09:33 "akuma19" was rejected (pending since 2025-12-13T09:31:39.044Z). * 07:06 [[gitlab:syunsyunminmin|@syunsyunminmin]] was approved. === 2026-03-12 === * 20:24 [[gitlab:11wb|@11wb]] was approved. * 09:54 [[gitlab:bcxfu75k|@bcxfu75k]] was approved. === 2026-03-10 === * 09:12 [[gitlab:viktoriahillerudwmse|@viktoriahillerudwmse]] was approved. === 2026-03-06 === * 08:09 "vazhayilnewone" was rejected (pending since 2025-12-05T08:07:02.184Z). === 2026-03-04 === * 20:54 [[gitlab:elphie|@elphie]] was approved. * 11:39 "ronaldahmed" was rejected (pending since 2025-12-03T11:37:47.492Z). * 02:12 "ltslw" was rejected (pending since 2025-12-03T02:11:52.040Z). === 2026-03-02 === * 19:21 "dlopez350" was rejected (pending since 2025-12-01T19:20:38.918Z). * 18:15 [[gitlab:lsandergreen|@lsandergreen]] was approved. === 2026-03-01 === * 10:51 [[gitlab:clintacc|@clintacc]] was approved. === 2026-02-28 === * 09:24 "cardboardlamp" was rejected (pending since 2025-11-29T09:22:03.947Z). 
* 08:18 "wiki-pavan" was rejected (pending since 2025-11-29T08:16:24.184Z). === 2026-02-27 === * 20:45 "thisisrick25" was rejected (pending since 2025-11-28T20:42:24.454Z). === 2026-02-26 === * 13:57 "chuiimuiiofc" was rejected (pending since 2025-11-27T13:57:02.794Z). * 13:54 "steffpro" was rejected (pending since 2025-11-27T13:52:10.859Z). === 2026-02-25 === * 21:24 "abubakarhabibudayyabu" was rejected (pending since 2025-11-26T21:22:37.776Z). === 2026-02-24 === * 05:00 "playboi" was rejected (pending since 2025-11-25T05:00:30.762Z). === 2026-02-23 === * 14:00 "alph65" was rejected (pending since 2025-11-24T13:59:00.797Z). * 12:33 [[gitlab:robertsky|@robertsky]] was approved. === 2026-02-22 === * 00:30 "hp8p" was rejected (pending since 2025-11-23T00:29:24.741Z). === 2026-02-19 === * 16:45 "clayjar" was rejected (pending since 2025-11-20T16:44:48.380Z). === 2026-02-18 === * 22:18 "nexus" was rejected (pending since 2025-11-19T22:16:48.818Z). * 12:00 "bernsteinnn" was rejected (pending since 2025-11-19T11:59:04.427Z). === 2026-02-17 === * 11:36 "jason2000-cpu" was rejected (pending since 2025-11-18T11:34:00.314Z). === 2026-02-16 === * 14:54 "smaurya" was rejected (pending since 2025-11-17T14:52:06.906Z). === 2026-02-15 === * 16:51 "kra-79" was rejected (pending since 2025-11-16T16:50:41.375Z). === 2026-02-14 === * 15:15 [[gitlab:mess|@mess]] was approved. === 2026-02-13 === * 13:57 "sopalsuemae957" was rejected (pending since 2025-11-14T13:55:16.921Z). * 13:30 [[gitlab:wyslijp16-toolforge|@wyslijp16-toolforge]] was approved. === 2026-02-12 === * 16:30 "kristinagligoric" was rejected (pending since 2025-11-13T16:29:21.646Z). * 03:33 [[gitlab:anyehansen|@anyehansen]] was approved. * 02:21 [[gitlab:thejoyfultentmaker|@thejoyfultentmaker]] was approved. === 2026-02-10 === * 13:18 [[gitlab:db111|@db111]] was approved. === 2026-02-09 === * 19:06 "squirrel289" was rejected (pending since 2025-11-10T19:04:27.831Z). 
=== 2026-02-06 === * 20:54 [[gitlab:gillux|@gillux]] was approved. * 09:09 [[gitlab:lih|@lih]] was approved. === 2026-01-31 === * 16:21 [[gitlab:taxonbot1|@taxonbot1]] was approved. === 2026-01-28 === * 14:30 [[gitlab:ademola|@ademola]] was approved. * 10:51 "watshell" was rejected (pending since 2025-10-29T10:51:01.521Z). === 2026-01-26 === * 23:06 "tavaresgmg" was rejected (pending since 2025-10-27T23:04:42.140Z). === 2026-01-25 === * 06:03 "cata" was rejected (pending since 2025-10-26T06:01:26.155Z). === 2026-01-24 === * 21:15 [[gitlab:wiegels|@wiegels]] was approved. * 06:30 [[gitlab:blaquans|@blaquans]] was approved. === 2026-01-23 === * 16:27 [[gitlab:lerickson|@lerickson]] was approved. * 10:15 "fran0035g" was rejected (pending since 2025-10-24T10:12:17.732Z). === 2026-01-22 === * 21:00 "hacksyn" was rejected (pending since 2025-10-23T20:59:15.982Z). === 2026-01-21 === * 17:30 [[gitlab:otcenas11|@otcenas11]] was approved. === 2026-01-19 === * 21:48 [[gitlab:amdrel|@amdrel]] was approved. * 04:36 "rayalexa" was rejected (pending since 2025-10-20T04:35:02.094Z). === 2026-01-18 === * 15:45 "somya" was rejected (pending since 2025-10-19T15:43:43.701Z). * 06:54 "sergg001" was rejected (pending since 2025-10-19T06:54:12.296Z). === 2026-01-16 === * 11:57 "zeejohsy" was rejected (pending since 2025-10-17T11:56:22.372Z). * 04:45 "rocky25" was rejected (pending since 2025-10-17T04:43:33.180Z). === 2026-01-15 === * 16:39 "tiisu" was rejected (pending since 2025-10-16T16:37:18.438Z). * 12:00 "noahalorwu" was rejected (pending since 2025-10-16T11:58:26.133Z). * 10:39 "prjayaiuedu" was rejected (pending since 2025-10-16T10:37:16.947Z). === 2026-01-13 === * 17:21 [[gitlab:lwilson-ctr|@lwilson-ctr]] was approved. === 2026-01-12 === * 17:03 "stagietechs" was rejected (pending since 2025-10-13T17:02:25.281Z). === 2026-01-10 === * 19:06 "keerthisr" was rejected (pending since 2025-10-11T19:05:01.758Z). 
=== 2026-01-09 === * 20:36 "lightb" was rejected (pending since 2025-10-10T20:34:20.264Z). === 2026-01-08 === * 19:42 [[gitlab:tbodt|@tbodt]] was approved. * 13:57 [[gitlab:martynranyard|@martynranyard]] was approved. === 2026-01-07 === * 17:48 [[gitlab:santanuwiki25|@santanuwiki25]] was approved. * 14:27 "dipanshu" was rejected (pending since 2025-10-08T14:26:10.794Z). * 12:30 "adeolaadesina" was rejected (pending since 2025-10-08T12:29:49.592Z). * 09:21 "tony-kamande" was rejected (pending since 2025-10-08T09:20:28.421Z). * 06:18 "hninwuttyi" was rejected (pending since 2025-10-08T06:17:28.006Z). * 05:09 "andume" was rejected (pending since 2025-10-08T05:07:18.582Z). * 02:00 "mosope" was rejected (pending since 2025-10-08T01:59:54.800Z). * 01:15 [[gitlab:tungstalite|@tungstalite]] was approved. === 2026-01-06 === * 18:24 "leerensucher" was rejected (pending since 2025-10-07T18:21:41.253Z). * 14:54 "leonidlednev" was rejected (pending since 2025-10-07T14:53:07.273Z). * 12:57 "alexandre-tingaud" was rejected (pending since 2025-10-07T12:54:27.206Z). === 2026-01-04 === * 21:33 [[gitlab:matr1x-101|@matr1x-101]] was approved. * 15:18 "makjr" was rejected (pending since 2025-10-05T15:16:31.558Z). * 14:09 "dakshq" was rejected (pending since 2025-10-05T14:08:40.608Z). === 2026-01-03 === * 20:42 [[gitlab:apehitkey|@apehitkey]] was approved. * 18:00 [[gitlab:jeremyb|@jeremyb]] was approved. * 14:09 [[gitlab:twelephant|@twelephant]] was approved. === 2026-01-01 === * 11:30 "shellstanislav" was rejected (pending since 2025-10-02T11:29:10.150Z). === 2025-12-30 === * 19:51 "camilojdiaz" was rejected (pending since 2025-09-30T19:49:24.913Z). === 2025-12-29 === * 16:03 "zied" was rejected (pending since 2025-09-29T16:01:30.415Z). * 08:18 "rahulsidpradhan" was rejected (pending since 2025-09-29T08:17:02.849Z). === 2025-12-26 === * 09:48 "thembo42" was rejected (pending since 2025-09-26T09:45:15.033Z). 
=== 2025-12-25 === * 14:03 "196936074751" was rejected (pending since 2025-09-25T14:02:31.367Z). === 2025-12-23 === * 16:21 "ngarnsworthy" was rejected (pending since 2025-09-23T16:20:41.211Z). === 2025-12-22 === * 12:39 "aza555" was rejected (pending since 2025-09-22T12:38:02.622Z). === 2025-12-20 === * 23:45 "saph" was rejected (pending since 2025-09-20T23:45:01.222Z). === 2025-12-19 === * 10:15 "vladdymoses" was rejected (pending since 2025-09-19T10:15:00.999Z). * 07:15 "dirtylittlepoobah" was rejected (pending since 2025-09-19T07:13:55.537Z). === 2025-12-18 === * 16:24 [[gitlab:guyfawcus|@guyfawcus]] was approved. === 2025-12-17 === * 21:39 [[gitlab:holdyourhorses|@holdyourhorses]] was approved. * 18:30 "prudencia" was rejected (pending since 2025-09-17T18:27:18.860Z). * 02:24 "lottie" was rejected (pending since 2025-09-17T02:21:21.744Z). === 2025-12-16 === * 09:39 [[gitlab:melcatherine|@melcatherine]] was approved. * 08:54 [[gitlab:leila237|@leila237]] was approved. === 2025-12-15 === * 18:27 [[gitlab:royalsailor|@royalsailor]] was approved. * 09:39 [[gitlab:olaf8940|@olaf8940]] was approved. * 09:39 "brianbybyby" was rejected (pending since 2025-09-15T09:37:45.430Z). === 2025-12-14 === * 20:21 [[gitlab:essa237|@essa237]] was approved. * 16:42 [[gitlab:bovimacoco|@bovimacoco]] was approved. === 2025-12-13 === * 21:54 "mmns21" was rejected (pending since 2025-09-13T21:52:24.017Z). * 20:33 "bugcrawler" was rejected (pending since 2025-09-13T20:31:09.211Z). === 2025-12-12 === * 14:39 "ruvchoudhary" was rejected (pending since 2025-09-12T14:36:16.167Z). * 06:54 "rezadress" was rejected (pending since 2025-09-12T06:52:21.749Z). === 2025-12-10 === * 17:30 [[gitlab:itsmoon|@itsmoon]] was approved. === 2025-12-09 === * 15:42 [[gitlab:mercy-o|@mercy-o]] was approved. === 2025-12-06 === * 16:45 "jacquesradjabu" was rejected (pending since 2025-09-06T16:45:17.969Z). * 11:27 [[gitlab:ikhitron|@ikhitron]] was approved. 
=== 2025-12-01 === * 08:12 "halconmilenario21" was rejected (pending since 2025-09-01T08:12:10.262Z). === 2025-11-30 === * 21:06 [[gitlab:habs|@habs]] was approved. === 2025-11-29 === * 16:36 "bovimacoco" was rejected (pending since 2025-08-30T16:34:39.712Z). * 00:45 [[gitlab:jjpmaster|@jjpmaster]] was approved. === 2025-11-24 === * 10:30 "alph65" was rejected (pending since 2025-08-25T10:28:40.957Z). * 02:24 [[gitlab:yaron|@yaron]] was approved. === 2025-11-20 === * 16:06 "clayjar" was rejected (pending since 2025-08-21T16:04:54.450Z). === 2025-11-17 === * 21:09 [[gitlab:ankita97531|@ankita97531]] was approved. === 2025-11-16 === * 14:15 "commanderkefir" was rejected (pending since 2025-08-17T14:13:14.791Z). * 08:21 "rehankhan78" was rejected (pending since 2025-08-17T08:19:44.896Z). === 2025-11-15 === * 14:36 "cyberscribe" was rejected (pending since 2025-08-16T14:34:27.230Z). === 2025-11-13 === * 04:21 "waddie96" was rejected (pending since 2025-08-14T04:19:27.461Z). === 2025-11-11 === * 06:42 [[gitlab:seanhoyland|@seanhoyland]] was approved. === 2025-11-10 === * 00:06 [[gitlab:jaredblumer|@jaredblumer]] was approved. === 2025-11-09 === * 22:36 "heinxiety" was rejected (pending since 2025-08-10T22:33:12.041Z). === 2025-11-07 === * 22:00 [[gitlab:forzagreen|@forzagreen]] was approved. === 2025-11-06 === * 16:57 [[gitlab:rsilvola|@rsilvola]] was approved. === 2025-11-04 === * 21:24 [[gitlab:devdoingdev|@devdoingdev]] was approved. === 2025-11-03 === * 17:48 "joewaleed98" was rejected (pending since 2025-08-04T17:46:12.191Z). === 2025-11-01 === * 18:00 "eliasempresas" was rejected (pending since 2025-08-02T17:58:04.412Z). === 2025-10-31 === * 18:51 [[gitlab:chaoticenby|@chaoticenby]] was approved. * 04:33 "3ch310n" was rejected (pending since 2025-08-01T04:32:21.982Z). === 2025-10-30 === * 10:03 [[gitlab:tausheefhassan|@tausheefhassan]] was approved. === 2025-10-29 === * 14:54 "theap" was rejected (pending since 2025-07-30T14:52:12.066Z). 
=== 2025-10-28 === * 06:06 [[gitlab:tanbiruzzaman|@tanbiruzzaman]] was approved. === 2025-10-27 === * 07:51 [[gitlab:jmoore111|@jmoore111]] was approved. === 2025-10-25 === * 21:09 [[gitlab:valor|@valor]] was approved. * 21:03 [[gitlab:booksmurf|@booksmurf]] was approved. * 02:48 "mystyc1" was rejected (pending since 2025-07-26T02:46:19.373Z). === 2025-10-24 === * 05:12 "aadarshmahesh" was rejected (pending since 2025-07-25T05:09:38.264Z). === 2025-10-22 === * 20:54 [[gitlab:janewanga|@janewanga]] was approved. * 17:27 "abeljeevan" was rejected (pending since 2025-07-23T17:26:46.884Z). * 16:12 "shrimpnaur" was rejected (pending since 2025-07-23T16:10:37.864Z). === 2025-10-21 === * 18:51 "jrmuizel" was rejected (pending since 2025-07-22T18:50:07.315Z). * 09:33 [[gitlab:dpogorzelski|@dpogorzelski]] was approved. === 2025-10-17 === * 13:21 [[gitlab:blegodwin|@blegodwin]] was approved. === 2025-10-16 === * 14:51 [[gitlab:bahago|@bahago]] was approved. * 14:12 "harikrishna0005" was rejected (pending since 2025-07-17T14:10:48.385Z). * 14:09 "gauthammohanraj" was rejected (pending since 2025-07-17T14:08:47.643Z). === 2025-10-15 === * 13:48 [[gitlab:adwivedii|@adwivedii]] was approved. * 13:18 [[gitlab:kimbrenekakande|@kimbrenekakande]] was approved. * 13:03 "childmnajennifer" was rejected (pending since 2025-07-16T13:01:50.236Z). * 05:06 "vssb4214" was rejected (pending since 2025-07-16T05:05:33.985Z). === 2025-10-14 === * 19:39 [[gitlab:afanyulionel|@afanyulionel]] was approved. * 15:33 [[gitlab:sadrettin|@sadrettin]] was approved. * 14:18 [[gitlab:tmwyk|@tmwyk]] was approved. * 08:42 "yasu0796" was rejected (pending since 2025-07-15T08:41:26.453Z). === 2025-10-13 === * 16:09 [[gitlab:atlas0007|@atlas0007]] was approved. === 2025-10-11 === * 17:42 [[gitlab:techwizzie|@techwizzie]] was approved. === 2025-10-10 === * 19:03 [[gitlab:miiswom|@miiswom]] was approved. * 16:06 [[gitlab:ninatakang|@ninatakang]] was approved. 
=== 2025-10-09 === * 15:42 [[gitlab:jaykaneki|@jaykaneki]] was approved. * 14:21 [[gitlab:lebogang|@lebogang]] was approved. * 14:15 [[gitlab:kimondorose|@kimondorose]] was approved. * 13:48 [[gitlab:joyakinyi|@joyakinyi]] was approved. * 13:48 [[gitlab:dikshyashahi|@dikshyashahi]] was approved. * 13:45 [[gitlab:obediobadiah|@obediobadiah]] was approved. * 13:45 [[gitlab:system625|@system625]] was approved. * 13:45 [[gitlab:rolalove|@rolalove]] was approved. * 13:39 [[gitlab:olatundeawo|@olatundeawo]] was approved. * 13:36 [[gitlab:danielchristlight|@danielchristlight]] was approved. * 13:36 [[gitlab:dipanshu1223|@dipanshu1223]] was approved. * 13:36 [[gitlab:aradhya|@aradhya]] was approved. * 09:57 "bognd" was rejected (pending since 2025-07-10T09:55:48.661Z). === 2025-10-08 === * 23:36 [[gitlab:sopzy|@sopzy]] was approved. * 23:03 [[gitlab:oluwatumininu|@oluwatumininu]] was approved. * 19:39 [[gitlab:levon003|@levon003]] was approved. * 15:24 [[gitlab:ritika-bhambri11|@ritika-bhambri11]] was approved. * 13:45 [[gitlab:anbanguyen|@anbanguyen]] was approved. * 13:36 [[gitlab:chumzine|@chumzine]] was approved. * 13:27 [[gitlab:shr0x-ya|@shr0x-ya]] was approved. * 12:45 [[gitlab:nurahwakili|@nurahwakili]] was approved. * 03:42 "nazhiba" was rejected (pending since 2025-07-09T03:40:12.625Z). * 02:12 "mafennel" was rejected (pending since 2025-07-09T02:11:40.598Z). === 2025-10-07 === * 22:54 [[gitlab:olusegunfaj|@olusegunfaj]] was approved. * 21:30 [[gitlab:rona|@rona]] was approved. * 21:09 [[gitlab:sandijigs|@sandijigs]] was approved. * 13:36 "xisbajao" was rejected (pending since 2025-07-08T13:33:35.018Z). * 01:36 "areczek94" was rejected (pending since 2025-07-08T01:35:40.633Z). === 2025-10-06 === * 19:21 "wmcarter2017" was rejected (pending since 2025-07-07T19:21:12.899Z). === 2025-10-05 === * 14:15 "meetmendapara" was rejected (pending since 2025-07-06T14:14:16.726Z). === 2025-10-04 === * 20:51 "nftbaee" was rejected (pending since 2025-07-05T20:50:57.688Z). 
=== 2025-10-03 === * 06:12 [[gitlab:javiermonton|@javiermonton]] was approved. === 2025-10-02 === * 20:15 "talaqalotaibipmp" was rejected (pending since 2025-07-03T20:13:05.164Z). === 2025-10-01 === * 10:54 "bjensen" was rejected (pending since 2025-07-02T10:53:46.574Z). * 02:45 "kowal1984" was rejected (pending since 2025-07-02T02:44:56.946Z). === 2025-09-30 === * 21:21 [[gitlab:kavaljeetsingh|@kavaljeetsingh]] was approved. * 00:24 "adium" was rejected (pending since 2025-07-01T00:23:43.807Z). === 2025-09-28 === * 08:54 [[gitlab:pexerik|@pexerik]] was approved. === 2025-09-27 === * 13:57 [[gitlab:rubahhitamvukova|@rubahhitamvukova]] was approved. === 2025-09-26 === * 16:57 "algorithmic" was rejected (pending since 2025-06-27T16:56:17.480Z). * 13:54 [[gitlab:shadabgdg|@shadabgdg]] was approved. * 13:12 [[gitlab:spushpit|@spushpit]] was approved. === 2025-09-20 === * 14:06 "bwiki" was rejected (pending since 2025-06-21T13:59:14.749Z). === 2025-09-16 === * 05:39 [[gitlab:deepchirp|@deepchirp]] was approved. === 2025-09-15 === * 22:00 [[gitlab:noisk8|@noisk8]] was approved. * 11:03 "ahonc" was rejected (pending since 2025-06-16T11:00:54.843Z). === 2025-09-13 === * 18:24 "a-ssh22" was rejected (pending since 2025-06-14T18:23:33.937Z). * 12:36 [[gitlab:rajashreetalukdar|@rajashreetalukdar]] was approved. * 00:45 [[gitlab:sumitsurai|@sumitsurai]] was approved. === 2025-09-12 === * 17:12 [[gitlab:suyash23|@suyash23]] was approved. * 00:46 "remotetravel" was rejected (pending since 2025-06-13T00:44:08.171Z). === 2025-09-10 === * 21:09 "jancborchardt" was rejected (pending since 2025-06-11T21:06:30.759Z). === 2025-09-09 === * 17:03 [[gitlab:vwf|@vwf]] was approved. * 06:36 [[gitlab:cactusisme|@cactusisme]] was approved. === 2025-09-08 === * 18:09 "birushandegeya" was rejected (pending since 2025-06-09T18:08:00.087Z). * 16:27 "ngarnsworthy" was rejected (pending since 2025-06-09T16:24:37.213Z). * 12:33 "zolgoyo" was rejected (pending since 2025-06-09T12:31:34.199Z). 
=== 2025-09-06 === * 23:09 [[gitlab:jaishsingh913|@jaishsingh913]] was approved. === 2025-09-05 === * 21:45 [[gitlab:sakshi2|@sakshi2]] was approved. * 20:42 "abdukhaliq1" was rejected (pending since 2025-06-06T20:40:42.023Z). * 14:27 "beubsamy" was rejected (pending since 2025-06-06T14:27:06.781Z). === 2025-09-04 === * 23:27 "sdhehua" was rejected (pending since 2025-06-05T23:24:45.777Z). * 19:00 [[gitlab:perry|@perry]] was approved. * 11:24 "saintwolf" was rejected (pending since 2025-06-05T11:21:20.176Z). === 2025-09-02 === * 05:48 [[gitlab:aliu|@aliu]] was approved. === 2025-08-29 === * 13:30 "kksurendran066" was rejected (pending since 2025-05-30T13:27:48.755Z). === 2025-08-28 === * 22:18 "tauraamuix" was rejected (pending since 2025-05-29T22:16:08.228Z). === 2025-08-26 === * 19:03 [[gitlab:dikkulah|@dikkulah]] was approved. === 2025-08-22 === * 23:51 [[gitlab:khoroshun_mike|@khoroshun_mike]] was approved. === 2025-08-21 === * 07:39 [[gitlab:yuka|@yuka]] was approved. === 2025-08-19 === * 07:48 [[gitlab:zhaofjx|@zhaofjx]] was approved. === 2025-08-17 === * 14:27 "madhan13k" was rejected (pending since 2025-05-18T14:26:08.973Z). === 2025-08-15 === * 10:15 "mohammed_abukhadra" was rejected (pending since 2025-05-16T10:14:48.403Z). === 2025-08-11 === * 11:48 "hmmyesbro" was rejected (pending since 2025-05-12T11:45:24.350Z). === 2025-08-10 === * 13:15 [[gitlab:dactyl|@dactyl]] was approved. === 2025-08-09 === * 04:39 "xxxx100000" was rejected (pending since 2025-05-10T04:37:44.949Z). === 2025-08-08 === * 14:33 [[gitlab:josefanthony|@josefanthony]] was approved. === 2025-08-07 === * 23:42 [[gitlab:robins7|@robins7]] was approved. * 21:42 [[gitlab:pols12|@pols12]] was approved. * 17:15 "sbronson" was rejected (pending since 2025-05-08T17:15:08.834Z). * 14:57 [[gitlab:alvindulle|@alvindulle]] was approved. * 14:45 [[gitlab:xentos|@xentos]] was approved. * 06:27 "jamesboste" was rejected (pending since 2025-05-08T06:25:14.793Z). 
* 03:57 "ysun" was rejected (pending since 2025-05-08T03:55:07.348Z). === 2025-08-06 === * 21:51 "pols12" was rejected (pending since 2025-05-07T21:49:13.598Z). * 01:51 "okeamah" was rejected (pending since 2025-05-07T01:48:50.114Z). === 2025-08-05 === * 09:15 "mobashir-2013" was rejected (pending since 2025-05-06T09:14:24.069Z). === 2025-08-01 === * 08:00 "douginamug" was rejected (pending since 2025-05-02T07:57:38.317Z). === 2025-07-31 === * 02:30 [[gitlab:ads|@ads]] was approved. === 2025-07-27 === * 13:15 "mrico2703" was rejected (pending since 2025-04-27T13:13:12.346Z). * 10:17 [[gitlab:josephfrancis12|@josephfrancis12]] was approved. * 10:17 [[gitlab:fuzzew|@fuzzew]] was approved. * 05:57 [[gitlab:biscuitbobby|@biscuitbobby]] was approved. * 05:48 [[gitlab:ecoholic|@ecoholic]] was approved. === 2025-07-26 === * 11:48 [[gitlab:chimnayyyy|@chimnayyyy]] was approved. * 11:48 [[gitlab:alwinalbert|@alwinalbert]] was approved. * 11:48 [[gitlab:hridyakk|@hridyakk]] was approved. * 11:45 [[gitlab:gaurigupta21|@gaurigupta21]] was approved. * 11:45 [[gitlab:binetaa|@binetaa]] was approved. * 10:21 [[gitlab:jyothikat22|@jyothikat22]] was approved. * 10:21 [[gitlab:zobotrombie|@zobotrombie]] was approved. * 10:21 [[gitlab:flykrth|@flykrth]] was approved. * 10:21 [[gitlab:mehrinshamim|@mehrinshamim]] was approved. * 10:21 [[gitlab:aadhi13|@aadhi13]] was approved. * 10:21 [[gitlab:malavikam05|@malavikam05]] was approved. * 10:18 [[gitlab:nf609|@nf609]] was approved. * 05:48 [[gitlab:nazalnihad|@nazalnihad]] was approved. * 05:48 [[gitlab:naveen28204280|@naveen28204280]] was approved. === 2025-07-25 === * 09:49 [[gitlab:kasyap9|@kasyap9]] was approved. * 09:30 [[gitlab:swayamagrahari|@swayamagrahari]] was approved. === 2025-07-24 === * 19:36 [[gitlab:madutgn|@madutgn]] was approved. === 2025-07-23 === * 20:09 [[gitlab:somerandomdeveloper|@somerandomdeveloper]] was approved. === 2025-07-22 === * 00:15 [[gitlab:iagoqnsi|@iagoqnsi]] was approved. 
=== 2025-07-21 === * 17:30 [[gitlab:asadiqui|@asadiqui]] was approved. * 16:39 [[gitlab:tryvix1509|@tryvix1509]] was approved. * 04:27 [[gitlab:damian|@damian]] was approved. === 2025-07-20 === * 09:42 "mike-khoroshun" was rejected (pending since 2025-04-20T09:42:22.732Z). === 2025-07-17 === * 17:57 [[gitlab:haroldkrabs|@haroldkrabs]] was approved. * 13:45 [[gitlab:envlh|@envlh]] was approved. === 2025-07-14 === * 10:24 [[gitlab:missguru|@missguru]] was approved. * 00:57 "clarfonthey" was rejected (pending since 2025-04-14T00:56:32.626Z). === 2025-07-13 === * 01:01 [[gitlab:l235|@l235]] was approved. === 2025-07-11 === * 03:06 "rodavlas" was rejected (pending since 2025-04-11T03:05:45.590Z). === 2025-07-06 === * 00:09 "lakasa" was rejected (pending since 2025-04-06T00:06:28.469Z). === 2025-07-05 === * 21:54 "ctrlzvi" was rejected (pending since 2025-04-05T21:54:12.542Z). * 14:30 "aminualiyu" was rejected (pending since 2025-04-05T14:27:22.617Z). === 2025-07-04 === * 03:15 [[gitlab:galstar|@galstar]] was approved. === 2025-07-02 === * 11:27 "vicolas11" was rejected (pending since 2025-04-02T11:25:12.682Z). === 2025-06-29 === * 23:12 "naomi723" was rejected (pending since 2025-03-30T23:09:24.630Z). === 2025-06-28 === * 16:21 "mudeh2372" was rejected (pending since 2025-03-29T16:18:27.057Z). === 2025-06-27 === * 23:18 "rony143" was rejected (pending since 2025-03-28T23:16:13.671Z). * 22:21 [[gitlab:rluts|@rluts]] was approved. === 2025-06-26 === * 13:54 "creativegurus" was rejected (pending since 2025-03-27T13:52:41.706Z). === 2025-06-24 === * 17:42 [[gitlab:devjadiya|@devjadiya]] was approved. * 14:00 "dominic-r" was rejected (pending since 2025-03-25T14:00:07.307Z). === 2025-06-21 === * 00:48 [[gitlab:vriaa|@vriaa]] was approved. === 2025-06-18 === * 15:21 "ayushkhati1" was rejected (pending since 2025-03-19T15:18:50.062Z). === 2025-06-17 === * 20:45 "chiomavero" was rejected (pending since 2025-03-18T20:44:13.967Z). 
* 00:27 [[gitlab:eggroll97|@eggroll97]] was approved. === 2025-06-14 === * 20:57 "volvox" was rejected (pending since 2025-03-15T20:56:34.018Z). === 2025-06-13 === * 16:09 [[gitlab:supergrey|@supergrey]] was approved. * 11:03 "chqaz" was rejected (pending since 2025-03-14T11:01:09.600Z). * 10:24 [[gitlab:slong-wmf|@slong-wmf]] was approved. * 10:15 "hearvox" was rejected (pending since 2025-03-14T10:13:13.112Z). === 2025-06-12 === * 15:18 "jlam" was rejected (pending since 2025-03-13T15:17:54.099Z). === 2025-06-09 === * 20:48 "dipanjansengupta" was rejected (pending since 2025-03-10T20:48:03.545Z). * 19:27 [[gitlab:reggycelly|@reggycelly]] was approved. * 14:51 "arendpieter" was rejected (pending since 2025-03-10T14:51:01.445Z). * 13:21 [[gitlab:greenreaper|@greenreaper]] was approved. * 09:33 [[gitlab:mmta|@mmta]] was approved. * 08:03 "a-ssh22" was rejected (pending since 2025-03-10T08:03:08.111Z). === 2025-06-08 === * 21:06 "mm-episodenlistedlvaupdater" was rejected (pending since 2025-03-09T21:04:06.323Z). === 2025-06-06 === * 11:06 [[gitlab:olea|@olea]] was approved. === 2025-06-05 === * 20:33 [[gitlab:encodedwp|@encodedwp]] was approved. * 15:00 [[gitlab:toluayo|@toluayo]] was approved. * 13:51 [[gitlab:arnold_lup|@arnold_lup]] was approved. * 11:54 "sdhehua" was rejected (pending since 2025-03-06T11:51:48.241Z). === 2025-06-03 === * 21:27 [[gitlab:wewakey|@wewakey]] was approved. * 12:36 "hunsimon2" was rejected (pending since 2025-03-04T12:34:56.520Z). * 11:54 "hunsimon" was rejected (pending since 2025-03-04T11:53:54.652Z). === 2025-06-02 === * 12:01 [[gitlab:jaimedes|@jaimedes]] was approved. === 2025-05-30 === * 18:00 "sathvik9105" was rejected (pending since 2025-02-28T17:59:42.867Z). * 11:21 [[gitlab:tonythomas01|@tonythomas01]] was approved. * 10:06 [[gitlab:gpsleo|@gpsleo]] was approved. === 2025-05-29 === * 22:12 [[gitlab:codynguyen1116|@codynguyen1116]] was approved. === 2025-05-28 === * 02:57 [[gitlab:saper|@saper]] was approved. 
=== 2025-05-27 === * 21:06 [[gitlab:mohammed_qays|@mohammed_qays]] was approved. * 15:33 "satanluimm" was rejected (pending since 2025-02-25T15:32:48.101Z). === 2025-05-26 === * 23:57 "seyedali220" was rejected (pending since 2025-02-24T23:56:17.621Z). === 2025-05-21 === * 11:12 [[gitlab:guilherme|@guilherme]] was approved. === 2025-05-19 === * 13:24 [[gitlab:emojiwiki|@emojiwiki]] was approved. === 2025-05-18 === * 00:00 "xidme" was rejected (pending since 2025-02-15T23:58:56.796Z). === 2025-05-17 === * 02:39 "kdh8219" was rejected (pending since 2025-02-15T02:36:32.237Z). === 2025-05-16 === * 15:09 [[gitlab:maxbinderwmf|@maxbinderwmf]] was approved. === 2025-05-15 === * 04:30 "inspectorzer0" was rejected (pending since 2025-02-13T04:27:33.179Z). === 2025-05-14 === * 17:42 [[gitlab:llugo|@llugo]] was approved. === 2025-05-13 === * 20:18 "mmta" was rejected (pending since 2025-02-11T20:17:23.407Z). === 2025-05-11 === * 20:51 "jad" was rejected (pending since 2025-02-09T20:49:07.333Z). * 17:54 "nishchalsundan" was rejected (pending since 2025-02-09T17:52:25.761Z). * 16:39 "mohammed_abukhadra" was rejected (pending since 2025-02-09T16:39:03.730Z). === 2025-05-09 === * 09:12 [[gitlab:sirchanmp|@sirchanmp]] was approved. === 2025-05-08 === * 08:18 [[gitlab:mengeditch|@mengeditch]] was approved. === 2025-05-07 === * 03:45 "xluffy" was rejected (pending since 2025-02-05T03:45:14.181Z). === 2025-05-06 === * 16:54 "punhaniabhishek" was rejected (pending since 2025-02-04T16:53:50.758Z). * 09:36 [[gitlab:bmartinezcalvo|@bmartinezcalvo]] was approved. === 2025-05-02 === * 12:24 [[gitlab:tohaomg|@tohaomg]] was approved. * 11:48 [[gitlab:mavrikant|@mavrikant]] was approved. * 11:45 [[gitlab:daanvr|@daanvr]] was approved. === 2025-05-01 === * 09:09 "mjoerg" was rejected (pending since 2025-01-30T09:09:04.204Z). === 2025-04-30 === * 23:06 "sanskardubey" was rejected (pending since 2025-01-29T23:03:25.489Z). 
=== 2025-04-29 === * 16:00 "geyslein" was rejected (pending since 2025-01-28T16:00:01.510Z). === 2025-04-26 === * 09:30 "anjali9027" was rejected (pending since 2025-01-25T09:28:07.064Z). === 2025-04-25 === * 18:00 "salahhazaa" was rejected (pending since 2025-01-24T17:58:30.030Z). * 15:15 [[gitlab:yiming|@yiming]] was approved. * 02:06 "mrchanmp" was rejected (pending since 2025-01-24T02:03:58.308Z). === 2025-04-23 === * 17:03 "rj2904" was rejected (pending since 2025-01-22T17:03:11.207Z). * 14:21 "nischay33" was rejected (pending since 2025-01-22T14:19:21.081Z). === 2025-04-22 === * 19:27 "dj80" was rejected (pending since 2025-01-21T19:25:28.498Z). * 14:30 [[gitlab:kaimamin|@kaimamin]] was approved. * 09:57 "debo" was rejected (pending since 2025-01-21T09:54:47.955Z). === 2025-04-21 === * 12:24 "unshell" was rejected (pending since 2025-01-20T12:21:59.686Z). === 2025-04-18 === * 15:06 [[gitlab:spartanarbinger|@spartanarbinger]] was approved. === 2025-04-16 === * 03:09 "dewey" was rejected (pending since 2025-01-15T03:06:17.488Z). === 2025-04-15 === * 19:45 "emdadul" was rejected (pending since 2025-01-14T19:42:29.285Z). === 2025-04-14 === * 06:45 [[gitlab:bcampbell804|@bcampbell804]] was approved. === 2025-04-11 === * 06:27 [[gitlab:jvanderhoop|@jvanderhoop]] was approved. === 2025-04-10 === * 04:12 "bhai420" was rejected (pending since 2025-01-09T04:10:29.430Z). === 2025-04-09 === * 05:03 "austinvarshney" was rejected (pending since 2025-01-08T05:02:34.175Z). === 2025-04-06 === * 15:36 [[gitlab:elph|@elph]] was approved. === 2025-04-02 === * 10:33 [[gitlab:ozge|@ozge]] was approved. === 2025-03-31 === * 20:15 "demandkey" was rejected (pending since 2024-12-30T20:14:23.096Z). * 15:18 [[gitlab:danyya|@danyya]] was approved. === 2025-03-28 === * 15:54 [[gitlab:rutsavi09|@rutsavi09]] was approved. * 15:54 [[gitlab:ilanen1|@ilanen1]] was approved. === 2025-03-25 === * 19:27 [[gitlab:irfo|@irfo]] was approved. 
* 11:54 [[gitlab:kmontalva-wmf|@kmontalva-wmf]] was approved. * 04:33 [[gitlab:paul26|@paul26]] was approved. * 04:18 "as1100k" was rejected (pending since 2024-12-24T04:18:06.813Z). === 2025-03-24 === * 11:33 "amzadkhankk" was rejected (pending since 2024-12-23T11:33:14.176Z). === 2025-03-23 === * 12:24 "wolfdo" was rejected (pending since 2024-12-22T12:23:35.056Z). === 2025-03-22 === * 09:45 [[gitlab:fjmustak|@fjmustak]] was approved. === 2025-03-20 === * 18:42 "sathishkokila" was rejected (pending since 2024-12-19T18:39:35.161Z). * 17:03 [[gitlab:alien4444|@alien4444]] was approved. * 15:27 [[gitlab:davidcoronel|@davidcoronel]] was approved. === 2025-03-19 === * 22:57 [[gitlab:r1f4t|@r1f4t]] was approved. * 19:03 "daniel24ps" was rejected (pending since 2024-12-18T19:00:21.249Z). * 14:18 [[gitlab:beepbooppenguin|@beepbooppenguin]] was approved. === 2025-03-18 === * 17:48 "rahulkundu1209" was rejected (pending since 2024-12-17T17:46:41.936Z). * 08:15 "kirtisikka972" was rejected (pending since 2024-12-17T08:13:25.487Z). === 2025-03-15 === * 13:30 "tulspal_sidhu" was rejected (pending since 2024-12-14T13:29:10.606Z). * 01:39 "peacedeadc" was rejected (pending since 2024-12-14T01:37:36.579Z). === 2025-03-14 === * 03:51 [[gitlab:chuckthebuck|@chuckthebuck]] was approved. * 02:33 "yxngtrtxll" was rejected (pending since 2024-12-13T02:31:51.658Z). === 2025-03-13 === * 14:36 [[gitlab:iccander|@iccander]] was approved. === 2025-03-12 === * 23:21 "jokerchic36" was rejected (pending since 2024-12-11T23:21:00.670Z). * 15:30 [[gitlab:naomi|@naomi]] was approved. * 15:27 [[gitlab:cobi|@cobi]] was approved. === 2025-03-11 === * 12:42 "mohitvermaxx" was rejected (pending since 2024-12-10T12:40:56.967Z). === 2025-03-10 === * 16:51 [[gitlab:nanona15dobato|@nanona15dobato]] was approved. === 2025-03-09 === * 22:39 [[gitlab:jonkolbert|@jonkolbert]] was approved. * 20:45 [[gitlab:urbanecmtest2|@urbanecmtest2]] was approved. 
=== 2025-03-07 === * 16:54 [[gitlab:hswan|@hswan]] was approved. * 14:42 [[gitlab:atitkov|@atitkov]] was approved. * 00:42 [[gitlab:infrastruktur|@infrastruktur]] was approved. === 2025-03-06 === * 17:21 "johnmann" was rejected (pending since 2024-12-05T17:19:24.995Z). === 2025-03-05 === * 07:33 [[gitlab:monx9494|@monx9494]] was approved. === 2025-03-02 === * 21:21 "paul26" was rejected (pending since 2024-12-01T21:20:19.681Z). === 2025-03-01 === * 19:15 [[gitlab:izno|@izno]] was approved. * 12:45 [[gitlab:nyerho|@nyerho]] was approved. === 2025-02-28 === * 18:27 [[gitlab:chuckonwumelu|@chuckonwumelu]] was approved. * 13:09 "ashwinpraveengo" was rejected (pending since 2024-11-29T13:07:47.240Z). * 00:18 "eduardoaugusto" was rejected (pending since 2024-11-29T00:17:43.372Z). === 2025-02-27 === * 20:39 "volkanurl" was rejected (pending since 2024-11-28T20:37:18.101Z). === 2025-02-24 === * 21:15 [[gitlab:feeglgeef|@feeglgeef]] was approved. * 20:18 [[gitlab:piaanalysis2|@piaanalysis2]] was approved. * 19:06 [[gitlab:dhardy|@dhardy]] was approved. === 2025-02-22 === * 19:27 [[gitlab:owuh|@owuh]] was approved. === 2025-02-19 === * 16:06 [[gitlab:artemkloko|@artemkloko]] was approved. * 13:03 [[gitlab:jgafnea|@jgafnea]] was approved. === 2025-02-17 === * 16:33 [[gitlab:asmartkitten|@asmartkitten]] was approved. === 2025-02-16 === * 19:12 "gaurigupta21" was rejected (pending since 2024-11-17T19:11:07.416Z). === 2025-02-15 === * 01:18 [[gitlab:mediawiki-quickstart-ci|@mediawiki-quickstart-ci]] was approved. === 2025-02-14 === * 15:21 "nathanbnm" was rejected (pending since 2024-11-15T15:18:19.632Z). === 2025-02-13 === * 16:45 [[gitlab:priyanshuchahal|@priyanshuchahal]] was approved. * 16:42 [[gitlab:ajhalili2006|@ajhalili2006]] was approved. === 2025-02-12 === * 23:21 "monkeypatch999" was rejected (pending since 2024-11-13T23:20:38.398Z). * 06:36 [[gitlab:jainlakshita28|@jainlakshita28]] was approved. 
=== 2025-02-11 === * 19:27 [[gitlab:matthewsm2|@matthewsm2]] was approved. === 2025-02-09 === * 16:15 "mohammed_abukhadra" was rejected (pending since 2024-11-10T16:15:18.361Z). === 2025-02-07 === * 21:33 "brennan" was rejected (pending since 2024-11-08T21:31:07.351Z). === 2025-02-06 === * 08:24 "mmta" was rejected (pending since 2024-11-07T08:22:36.724Z). * 06:21 [[gitlab:bunnypranav|@bunnypranav]] was approved. === 2025-02-05 === * 22:39 "chrissteinchen" was rejected (pending since 2024-11-06T22:38:16.673Z). === 2025-02-03 === * 07:45 "edriiic" was rejected (pending since 2024-11-04T07:44:46.849Z). * 01:12 "geppy" was rejected (pending since 2024-11-04T01:10:48.710Z). === 2025-02-02 === * 13:18 "funa-enpitu" was rejected (pending since 2024-11-03T13:15:46.065Z). === 2025-01-31 === * 23:42 "nfontes" was rejected (pending since 2024-11-01T23:39:41.755Z). * 22:51 "sbronson" was rejected (pending since 2024-11-01T22:50:31.871Z). * 00:42 [[gitlab:farid|@farid]] was approved. === 2025-01-27 === * 08:15 [[gitlab:eliza189|@eliza189]] was approved. === 2025-01-25 === * 09:51 [[gitlab:pamputt|@pamputt]] was approved. === 2025-01-23 === * 14:30 [[gitlab:lubianat|@lubianat]] was approved. * 11:45 [[gitlab:bootsa|@bootsa]] was approved. === 2025-01-21 === * 05:09 "niko" was rejected (pending since 2024-07-21T16:10:01.377Z). * 05:09 "thawizkid369777" was rejected (pending since 2024-07-18T17:42:44.493Z). * 05:09 "sarthaksingh2" was rejected (pending since 2024-07-10T11:31:30.470Z). * 05:09 "shriyakt" was rejected (pending since 2024-07-06T04:54:10.248Z). * 05:09 "akshaya" was rejected (pending since 2024-07-06T04:04:51.488Z). * 05:09 "alaka03aj" was rejected (pending since 2024-07-05T18:01:54.876Z). * 05:09 "sulochanaviji-5049" was rejected (pending since 2024-07-01T05:58:00.427Z). * 05:09 "nayanjnath" was rejected (pending since 2024-07-01T02:51:57.405Z). * 05:09 "sd44" was rejected (pending since 2024-06-30T04:28:51.436Z). 
* 05:09 "metavalent" was rejected (pending since 2024-06-29T01:37:14.210Z). * 05:09 "wicloudx" was rejected (pending since 2024-06-28T11:51:23.335Z). * 05:09 "debo" was rejected (pending since 2024-06-28T01:44:59.845Z). * 05:09 "bwiki" was rejected (pending since 2024-06-23T14:15:38.032Z). * 05:09 "toprak" was rejected (pending since 2024-06-23T11:35:50.819Z). * 05:09 "iristeller" was rejected (pending since 2024-06-14T20:53:48.959Z). * 05:09 "jcolvin" was rejected (pending since 2024-06-12T17:29:01.238Z). * 05:09 "kalyan" was rejected (pending since 2024-06-07T07:52:46.993Z). * 05:09 "bluecrystal" was rejected (pending since 2024-06-06T19:16:20.107Z). * 05:09 "iftttrohit" was rejected (pending since 2024-06-04T12:08:50.818Z). * 05:09 "pogpotato" was rejected (pending since 2024-06-03T17:58:21.684Z). * 05:09 "cptlausebaer" was rejected (pending since 2024-05-31T18:53:27.692Z). * 05:09 "hdevine825" was rejected (pending since 2024-05-31T17:04:18.279Z). * 05:09 "anaghaa18" was rejected (pending since 2024-05-25T19:14:31.803Z). * 05:09 "atharvanair04" was rejected (pending since 2024-05-25T14:24:52.825Z). * 05:09 "anasvemmully" was rejected (pending since 2024-05-25T06:10:27.261Z). * 05:09 "abhinavmohandas" was rejected (pending since 2024-05-25T06:05:24.825Z). * 05:09 "kksurendran06" was rejected (pending since 2024-05-25T06:04:38.082Z). * 05:09 "albertmarshall8896" was rejected (pending since 2024-05-23T09:32:05.462Z). * 05:09 "akellison" was rejected (pending since 2024-05-17T02:07:24.229Z). * 05:09 "mainowill" was rejected (pending since 2024-04-16T23:30:33.881Z). * 05:09 "bzhqc" was rejected (pending since 2024-04-16T19:50:38.676Z). * 05:09 "safan41" was rejected (pending since 2024-04-16T03:34:48.942Z). * 05:09 "mgagat" was rejected (pending since 2024-04-16T03:21:51.764Z). * 05:09 "okeamah" was rejected (pending since 2024-04-16T02:49:00.143Z). * 05:09 "xuhao61" was rejected (pending since 2024-04-15T23:45:09.083Z). 
* 04:47 "cybel" was rejected (pending since 2024-04-15T06:46:35.791Z). === 2025-01-20 === * 14:33 [[gitlab:your1|@your1]] was approved. === 2025-01-18 === * 10:09 [[gitlab:galrach600|@galrach600]] was approved. * 02:51 [[gitlab:blankeclair|@blankeclair]] was approved. === 2025-01-17 === * 13:57 [[gitlab:dsantamaria|@dsantamaria]] was approved. === 2025-01-15 === * 17:12 [[gitlab:smartse|@smartse]] was approved. === 2025-01-14 === * 17:03 [[gitlab:naorleizer|@naorleizer]] was approved. === 2025-01-13 === * 02:45 [[gitlab:wolf20482|@wolf20482]] was approved. === 2025-01-12 === * 17:45 [[gitlab:tamzin|@tamzin]] was approved. === 2025-01-11 === * 15:24 [[gitlab:bargioni|@bargioni]] was approved. * 14:30 [[gitlab:salelya|@salelya]] was approved. * 10:15 [[gitlab:malakatshy|@malakatshy]] was approved. * 05:21 [[gitlab:newmcpee|@newmcpee]] was approved. === 2025-01-09 === * 15:30 [[gitlab:gkyziridis|@gkyziridis]] was approved. === 2025-01-08 === * 16:21 [[gitlab:ukrface|@ukrface]] was approved. === 2024-12-28 === * 03:27 [[gitlab:twonum|@twonum]] was approved. === 2024-12-25 === * 06:09 [[gitlab:harsv567|@harsv567]] was approved. === 2024-12-21 === * 11:24 [[gitlab:amutha2002|@amutha2002]] was approved. === 2024-12-20 === * 19:51 [[gitlab:hridyeshgupta|@hridyeshgupta]] was approved. * 10:00 [[gitlab:ro-shines|@ro-shines]] was approved. * 08:09 [[gitlab:kesharwaniarpita|@kesharwaniarpita]] was approved. === 2024-12-18 === * 14:45 [[gitlab:soylacarli|@soylacarli]] was approved. === 2024-12-16 === * 20:33 [[gitlab:aleyasiddika1|@aleyasiddika1]] was approved. === 2024-12-15 === * 07:33 [[gitlab:abhishek02bhardwaj|@abhishek02bhardwaj]] was approved. === 2024-12-13 === * 13:18 [[gitlab:ashmitabathre204|@ashmitabathre204]] was approved. === 2024-12-10 === * 06:39 [[gitlab:ginaan|@ginaan]] was approved. === 2024-12-09 === * 05:45 [[gitlab:kallinavya|@kallinavya]] was approved. * 00:54 [[gitlab:viserion-7|@viserion-7]] was approved. 
=== 2024-12-08 === * 17:27 [[gitlab:wargo|@wargo]] was approved. === 2024-12-05 === * 11:15 [[gitlab:ranjithraj|@ranjithraj]] was approved. === 2024-12-02 === * 21:21 [[gitlab:a930913|@a930913]] was approved. === 2024-12-01 === * 02:39 [[gitlab:kingchristlike1|@kingchristlike1]] was approved. === 2024-11-21 === * 13:45 [[gitlab:sascha|@sascha]] was approved. === 2024-11-19 === * 16:36 [[gitlab:jly|@jly]] was approved. === 2024-11-15 === * 02:54 [[gitlab:danielyepezgarces|@danielyepezgarces]] was approved. === 2024-11-14 === * 14:15 [[gitlab:stimoroll|@stimoroll]] was approved. === 2024-11-09 === * 17:15 [[gitlab:f4udeveloper|@f4udeveloper]] was approved. === 2024-11-07 === * 19:15 [[gitlab:zulf|@zulf]] was approved. * 05:33 [[gitlab:hassanamin|@hassanamin]] was approved. === 2024-11-06 === * 19:39 [[gitlab:daniuu|@daniuu]] was approved. * 00:18 [[gitlab:rlopez-wmf|@rlopez-wmf]] was approved. === 2024-10-09 === * 14:45 [[gitlab:jtweed|@jtweed]] was approved. * 10:24 [[gitlab:ifrahkh|@ifrahkh]] was approved. * 09:06 [[gitlab:wikibayer|@wikibayer]] was approved. === 2024-10-06 === * 10:27 [[gitlab:keerthan16|@keerthan16]] was approved. === 2024-10-04 === * 07:45 [[gitlab:hakimi97|@hakimi97]] was approved. === 2024-09-30 === * 07:39 [[gitlab:ninjastrikers|@ninjastrikers]] was approved. === 2024-09-28 === * 17:30 [[gitlab:webrunner95|@webrunner95]] was approved. === 2024-09-18 === * 21:39 [[gitlab:elliottetzkorn|@elliottetzkorn]] was approved. === 2024-09-14 === * 22:06 [[gitlab:humptydumpty|@humptydumpty]] was approved. === 2024-09-06 === * 08:48 [[gitlab:mickabarber|@mickabarber]] was approved. === 2024-08-27 === * 17:36 [[gitlab:edgars|@edgars]] was approved. === 2024-08-22 === * 09:18 [[gitlab:antonkokhwmde|@antonkokhwmde]] was approved. === 2024-08-14 === * 19:21 [[gitlab:jfk|@jfk]] was approved. === 2024-08-13 === * 17:57 [[gitlab:daxserver|@daxserver]] was approved. === 2024-08-11 === * 09:57 [[gitlab:pauliesnug|@pauliesnug]] was approved. 
=== 2024-08-10 ===
* 08:42 [[gitlab:ashig|@ashig]] was approved.
=== 2024-08-09 ===
* 14:09 [[gitlab:masssly|@masssly]] was approved.
=== 2024-08-05 ===
* 22:15 [[gitlab:mrtortue|@mrtortue]] was approved.
=== 2024-08-02 ===
* 16:21 [[gitlab:dsantini|@dsantini]] was approved.
=== 2024-07-31 ===
* 11:54 [[gitlab:cptviraj|@cptviraj]] was approved.
=== 2024-07-30 ===
* 19:09 [[gitlab:iniquity|@iniquity]] was approved.
* 10:00 [[gitlab:collins|@collins]] was approved.
=== 2024-07-27 ===
* 15:57 [[gitlab:songnguxyz|@songnguxyz]] was approved.
=== 2024-07-25 ===
* 12:36 [[gitlab:mszabo|@mszabo]] was approved.
* 09:21 [[gitlab:agarwalmahima|@agarwalmahima]] was approved.
=== 2024-07-24 ===
* 08:05 [[gitlab:dragoniez|@dragoniez]] was approved.
=== 2024-07-23 ===
* 06:54 [[gitlab:mirji|@mirji]] was approved.
=== 2024-07-16 ===
* 10:00 [[gitlab:lakejason0|@lakejason0]] was approved.
=== 2024-07-12 ===
* 11:33 [[gitlab:cn|@cn]] was approved.
* 08:12 [[gitlab:unchampignon|@unchampignon]] was approved.
=== 2024-07-07 ===
* 17:12 [[gitlab:agamyasamuel|@agamyasamuel]] was approved.
* 05:24 [[gitlab:kuldeepburjbhalaike|@kuldeepburjbhalaike]] was approved.
=== 2024-07-06 ===
* 11:18 [[gitlab:dibya|@dibya]] was approved.
* 04:54 [[gitlab:sarthakparashar|@sarthakparashar]] was approved.
=== 2024-07-05 ===
* 18:15 [[gitlab:vanshikarathi|@vanshikarathi]] was approved.
=== 2024-07-02 ===
* 19:00 [[gitlab:ebrahim|@ebrahim]] was approved.
=== 2024-07-01 ===
* 20:12 [[gitlab:rockingpenny4|@rockingpenny4]] was approved.
* 18:15 [[gitlab:balajijagadesh|@balajijagadesh]] was approved.
=== 2024-06-30 ===
* 18:24 [[gitlab:hrideshmg|@hrideshmg]] was approved.
* 07:18 [[gitlab:chanakyakumardas|@chanakyakumardas]] was approved.
* 06:30 [[gitlab:rihaan180|@rihaan180]] was approved.
=== 2024-06-27 ===
* 17:36 [[gitlab:driedmueller|@driedmueller]] was approved.
=== 2024-06-19 ===
* 12:57 [[gitlab:audreypenven|@audreypenven]] was approved.
=== 2024-06-16 ===
* 01:18 [[gitlab:roysmith|@roysmith]] was approved.
=== 2024-06-08 ===
* 02:45 [[gitlab:jleedev|@jleedev]] was approved.
=== 2024-06-03 ===
* 13:57 [[gitlab:afeder|@afeder]] was approved.
=== 2024-06-01 ===
* 10:54 [[gitlab:florianschmitt|@florianschmitt]] was approved.
=== 2024-05-30 ===
* 16:42 [[gitlab:krlsca|@krlsca]] was approved.
=== 2024-05-28 ===
* 11:24 [[gitlab:rickijay|@rickijay]] was approved.
=== 2024-05-26 ===
* 11:18 [[gitlab:ranjithsiji|@ranjithsiji]] was approved.
=== 2024-05-25 ===
* 07:24 [[gitlab:jony|@jony]] was approved.
=== 2024-05-23 ===
* 08:45 [[gitlab:lepticed7|@lepticed7]] was approved.
=== 2024-05-22 ===
* 20:42 [[gitlab:echecs|@echecs]] was approved.
=== 2024-05-21 ===
* 13:33 [[gitlab:mbs|@mbs]] was approved.
=== 2024-05-19 ===
* 18:06 [[gitlab:ionenlaser|@ionenlaser]] was approved.
=== 2024-05-18 ===
* 23:36 [[gitlab:mdaniels5757|@mdaniels5757]] was approved.
=== 2024-05-17 ===
* 08:54 [[gitlab:grapedog|@grapedog]] was approved.
=== 2024-05-08 ===
* 19:42 [[gitlab:kelhurd|@kelhurd]] was approved.
* 19:06 [[gitlab:khurd|@khurd]] was approved.
=== 2024-05-06 ===
* 19:48 [[gitlab:j3j5|@j3j5]] was approved.
* 12:06 [[gitlab:tk-999|@tk-999]] was approved.
=== 2024-05-05 ===
* 22:09 [[gitlab:pppery|@pppery]] was approved.
* 20:33 [[gitlab:sakretsu|@sakretsu]] was approved.
* 12:12 [[gitlab:waterquark|@waterquark]] was approved.
=== 2024-05-04 ===
* 09:03 [[gitlab:multichill|@multichill]] was approved.
* 07:42 [[gitlab:abaris|@abaris]] was approved.
=== 2024-05-03 ===
* 14:57 [[gitlab:maurusian|@maurusian]] was approved.
=== 2024-04-24 ===
* 05:48 [[gitlab:wolfinux|@wolfinux]] was approved.
=== 2024-04-23 ===
* 15:48 [[gitlab:dreamrimmer|@dreamrimmer]] was approved.
=== 2024-04-21 ===
* 06:51 [[gitlab:alon|@alon]] was approved.
=== 2024-04-17 ===
* 23:33 [[gitlab:derenrich|@derenrich]] was approved.
=== 2024-04-16 ===
* 17:18 [[gitlab:valcio|@valcio]] was approved.
=== 2024-04-14 ===
* 16:51 [[gitlab:wikilucas00|@wikilucas00]] was approved.
=== 2024-04-06 ===
* 12:48 [[gitlab:theprotonade|@theprotonade]] was approved.
=== 2024-04-02 ===
* 07:30 [[gitlab:bohuizhang|@bohuizhang]] was approved.
=== 2024-03-30 ===
* 13:36 [[gitlab:lpintscher|@lpintscher]] was approved.
=== 2024-03-26 ===
* 17:09 [[gitlab:eenabulele|@eenabulele]] was approved.
=== 2024-03-25 ===
* 14:27 [[gitlab:tuukka|@tuukka]] was approved.
=== 2024-03-24 ===
* 12:24 [[gitlab:firefly|@firefly]] was approved.
=== 2024-03-21 ===
* 19:33 [[gitlab:universal-omega|@universal-omega]] was approved.
=== 2024-03-17 ===
* 10:36 [[gitlab:bisel91|@bisel91]] was approved.
=== 2024-03-16 ===
* 10:09 [[gitlab:delord|@delord]] was approved.
* 00:42 [[gitlab:athulvis1|@athulvis1]] was approved.
=== 2024-03-15 ===
* 19:06 [[gitlab:ignaciorodrguez|@ignaciorodrguez]] was approved.
* 08:30 [[gitlab:peachey88|@peachey88]] was approved.
* 06:51 [[gitlab:derick|@derick]] was approved.
=== 2024-03-12 ===
* 15:06 [[gitlab:xiaoxiao|@xiaoxiao]] was approved.
=== 2024-03-06 ===
* 13:21 [[gitlab:desianabae1|@desianabae1]] was approved.
=== 2024-03-05 ===
* 19:21 [[gitlab:ep1c|@ep1c]] was approved.
* 16:33 [[gitlab:jasmine|@jasmine]] was approved.
=== 2024-03-02 ===
* 06:42 [[gitlab:potsdamlamb|@potsdamlamb]] was approved.
=== 2024-02-29 ===
* 23:18 [[gitlab:arandomname123|@arandomname123]] was approved.
* 18:03 [[gitlab:baba|@baba]] was approved.
* 17:48 [[gitlab:yfdyh000|@yfdyh000]] was approved.
* 03:09 [[gitlab:sds|@sds]] was approved.
=== 2024-02-27 ===
* 23:33 [[gitlab:lofhi|@lofhi]] was approved.
=== 2024-02-15 ===
* 19:45 [[gitlab:gergesshamon|@gergesshamon]] was approved.
=== 2024-02-14 ===
* 14:33 [[gitlab:philipnelson99|@philipnelson99]] was approved.
=== 2024-02-13 ===
* 13:06 [[gitlab:dringsim|@dringsim]] was approved.
=== 2024-02-12 ===
* 17:36 [[gitlab:haak|@haak]] was approved.
=== 2024-02-05 ===
* 17:33 [[gitlab:qwerfjkl|@qwerfjkl]] was approved.
* 17:14 [[gitlab:ahecht|@ahecht]] was approved.
=== 2024-02-01 ===
* 09:27 [[gitlab:arinaigum|@arinaigum]] was approved.
* 00:15 [[gitlab:jas42|@jas42]] was approved.
* 00:15 [[gitlab:edhu|@edhu]] was approved.
* 00:15 [[gitlab:marnanel|@marnanel]] was approved.
* 00:15 [[gitlab:ibrahemqasim|@ibrahemqasim]] was approved.
* 00:15 [[gitlab:amasotti|@amasotti]] was approved.
* 00:15 [[gitlab:deni|@deni]] was approved.
* 00:15 [[gitlab:cyber|@cyber]] was approved.
* 00:15 [[gitlab:saroj|@saroj]] was approved.
=== 2024-01-29 ===
* 21:42 [[gitlab:rgupta|@rgupta]] was approved.
=== 2024-01-07 ===
* 09:48 [[gitlab:lutrome|@lutrome]] was approved.
=== 2024-01-05 ===
* 20:48 [[gitlab:jinoytommanjaly|@jinoytommanjaly]] was approved.
* 02:51 [[gitlab:braunobruno|@braunobruno]] was approved.
* 01:08 [[gitlab:amorymeltzer|@amorymeltzer]] was approved.
* 01:08 [[gitlab:phi22ipus|@phi22ipus]] was approved.
=== 2024-01-03 ===
* 14:45 [[gitlab:gabina|@gabina]] was approved.
=== 2024-01-02 ===
* 13:18 [[gitlab:arthurtaylor|@arthurtaylor]] was approved.
=== 2023-12-23 ===
* 00:33 [[gitlab:aram|@aram]] was approved.
=== 2023-12-22 ===
* 16:24 [[gitlab:elpitareio|@elpitareio]] was approved.
=== 2023-12-21 ===
* 00:43 [[gitlab:bsadowski1|@bsadowski1]] was approved.
* 00:43 [[gitlab:ederporto|@ederporto]] was approved.
* 00:43 [[gitlab:sadraiiali|@sadraiiali]] was approved.
* 00:43 [[gitlab:wasp-outis|@wasp-outis]] was approved.
* 00:43 [[gitlab:bodhisattwa|@bodhisattwa]] was approved.
* 00:43 [[gitlab:air7538|@air7538]] was approved.
* 00:43 [[gitlab:anzx|@anzx]] was approved.
* 00:43 [[gitlab:tekask1903|@tekask1903]] was approved.
* 00:42 [[gitlab:kiwi-0x010c|@kiwi-0x010c]] was approved.
* 00:42 [[gitlab:mpaa|@mpaa]] was approved.
* 00:42 [[gitlab:kutay|@kutay]] was approved.
* 00:42 [[gitlab:wattmto|@wattmto]] was approved.