{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Power Method part 2\n", "## How to use the power method to find other eigenvectors and eigenvalues for a symmetric matrix." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you know, the power method will find a principal eigenvector for any matrix $A$ that has a principal eigenvector. By using the fact that the eigenvectors of a symmetric matrix are orthogonal, we will be able to use the power method to find all the other eigenvectors also.\n", "\n", "Here's the key mathematical point:\n", "\n", "Suppose $A$ is a symmetric matrix, $e$ is an eigenvector for $A$, and $v$ is orthogonal to $e$. Then $Av$ is also orthogonal to $e$.\n", "\n", "Here's the argument : Suppose $A$ is symmetric, $Ae = \\lambda e$, and $\\langle v, e \\rangle = 0$. Now look at $\\langle Av, e \\rangle$: \n", "\n", "$$\\langle Av, e \\rangle = \\langle v, A e \\rangle = \\langle v, \\lambda e\\rangle = \\lambda \\langle v, e \\rangle = \\lambda \\cdot 0 = 0.$$\n", "\n", "This mathematical fact suggests the following method to find all the eigenvectors of $A$.\n", "\n", "Choose a random vector, call it $v_0$, and use it as the input vector for the power method to find the principal eigenvector $e_0$.\n", "\n", "Choose another vector, call it $v_1$, that is orthogonal to $e_0$ and use it as the input vector in the power method. Since $v_1$ is orthogonal to $e_1$, the vector $A(v_1)$ will be orthogonal to $e_0$. So, $A^2(v_1), A^3(v_1), $ and all the iterates will be orthogonal to $e_0$. Therefore, the power method using $v_1$ as the input vector will converge to a second eigenvector, call it $e_1$. It will be the principal eigenvector of the operator $A$ restricted to the subspace consisting of all vectors orthogonal to $e_0$.\n", "\n", "Choose a random vector, call it $v_2$, that is orthogonal to $e_1$ and $e_0$ and use it as the input vector for the power method. Since $v_2$ is orthogonal to $e_1$ and $e_0$, all the iterates will be orthogonal to both $e_1$ and $e_0$, and so the power method with $v_2$ as input vector will converge to a third eigenvector, call it $e_2$. And so on...\n", "\n", "In an ideal world, we might be able to make this work. Let's try it.\n", "\n", "First, let's program the power method. The input will be a matrix $A$, an initial vector $x$, and a natural number $n$. The output will be the entire list of iterates:\n", "\n", "$$x, Ax, A^2x, A^3x, ..., A^nx$$ \n", "\n", "normalized to have unit length." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def powermethod(A, x, n):\n", " list=[x]\n", " for i in range(n):\n", " x = np.dot(A,x)\n", " x = 1/ np.linalg.norm(x)*x\n", " list.append(x)\n", " return list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 1: find $e_0$" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1 2 3 4]\n", " [ 2 5 6 7]\n", " [ 3 6 8 9]\n", " [ 4 7 9 10]]\n" ] } ], "source": [ "A = np.array([[1, 2, 3, 4], [2, 5, 6, 7], [3, 6, 8, 9], [4, 7, 9, 10]]);\n", "print(A)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "v0 = np.random.rand(4)\n", "list0 = powermethod(A,v0,10)\n", "e0 = list0[-1]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e0 = [0.22593827 0.44322186 0.57278788 0.6514755 ]\n", "lambda0 = 24.062535124396724\n", "Ae - lambda e = [ 1.77635684e-15 1.77635684e-15 1.77635684e-15 -1.77635684e-15]\n" ] } ], "source": [ "print(\"e0 =\",e0)\n", "Ae0 = np.dot(A,e0)\n", "lambda0 = np.dot(Ae0,e0)\n", "print(\"lambda0 = \",lambda0)\n", "print(\"Ae - lambda e = \", Ae0-lambda0*e0)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Step 2: Find $e_1$\n", "Now we will choose our second vector orthogonal to $e_0$. There's an easy way to do it: just project the $v_0$ onto $e_0$ and subtract it from $e_0$:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "v1 = v0 - np.dot(v0, e0)*e0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's easy to see $\\langle v_1, e_0\\rangle = 0$ but for skeptics:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8.326672684688674e-17" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(v1, e0)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "list1 = powermethod(A, v1, 10)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "e1 = list1[-1]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e1 = [-0.46990655 0.09173251 0.2408537 0.84425261]\n", "lambda1 = 8.838851111714085\n" ] } ], "source": [ "print(\"e1 =\",e1)\n", "Ae1 = np.dot(A,e1)\n", "lambda1 = np.dot(Ae1,e1)\n", "print(\"lambda1 = \",lambda1)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([7.96656407, 6.06292995, 6.53690852, 1.91048769])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(A,e1)-lambda1*e1" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[array([0.12964817, 0.50939541, 0.11970682, 0.5283428 ]),\n", " array([0.22531894, 0.44945885, 0.5698766 , 0.64996829]),\n", " array([0.22582307, 0.44331308, 0.57274854, 0.65148796]),\n", " array([0.22593824, 0.44322506, 0.57278785, 0.65147336]),\n", " array([0.22593818, 0.44322189, 0.57278786, 0.65147552]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.65147549]),\n", " array([0.22593827, 0.44322186, 
, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[array([0.12964817, 0.50939541, 0.11970682, 0.5283428 ]),\n", " array([0.22531894, 0.44945885, 0.5698766 , 0.64996829]),\n", " array([0.22582307, 0.44331308, 0.57274854, 0.65148796]),\n", " array([0.22593824, 0.44322506, 0.57278785, 0.65147336]),\n", " array([0.22593818, 0.44322189, 0.57278786, 0.65147552]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.65147549]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.6514755 ]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.6514755 ]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.6514755 ]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.6514755 ]),\n", " array([0.22593827, 0.44322186, 0.57278788, 0.6514755 ])]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list0" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[array([-0.02124169, 0.2133956 , -0.26282193, 0.09326357]),\n", " array([-0.08676131, 0.88334455, -0.40956771, -0.21078229]),\n", " array([-0.75479507, 0.59766194, -0.25769223, 0.08172716]),\n", " array([-0.00928794, 0.83118003, -0.00725835, -0.55587841]),\n", " array([-0.90652181, 0.31016397, -0.15576086, 0.24032286]),\n", " array([ 0.29749353, 0.69496676, 0.08336839, -0.64928307]),\n", " array([-0.89244903, 0.03389263, -0.15467586, 0.42244691]),\n", " array([ 0.52202852, 0.538352 , 0.1179475 , -0.65096216]),\n", " array([-0.82847502, -0.14404211, -0.15122788, 0.51962596]),\n", " array([ 0.63847108, 0.44325058, 0.14714083, -0.61174601]),\n", " array([-0.46990655, 0.09173251, 0.2408537 , 0.84425261])]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list1" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "list1 = powermethod(A, v1, 20)\n", "e1 = list1[-1]" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e1 = [0.22593827 0.44322186 0.57278788 0.6514755 ]\n", "lambda1 = 24.062535124396724\n" ] }, { "data": { "text/plain": [ "array([ 3.99680289e-14, 1.77635684e-14, 8.88178420e-15, -3.37507799e-14])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(\"e1 =\",e1)\n", "Ae1 = np.dot(A,e1)\n", "lambda1 = np.dot(Ae1,e1)\n", "print(\"lambda1 = \",lambda1)\n", "Ae1 - lambda1*e1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The power method produces an approximation of $e_0$ again! But how is that possible? Aren't all the iterates of $v_1$ supposed to be orthogonal to $e_0$?\n", "\n", "Let's check:\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8.326672684688674e-17\n", "1.199040866595169e-14\n", "5.577205364204474e-13\n", "2.2103640098802835e-11\n", "8.141484153512124e-10\n", "2.8044365274482175e-08\n", "9.130398436751896e-07\n", "2.859411817360269e-05\n", "0.0008752537849501252\n", "0.02645634613643802\n", "0.6224559519737064\n", "0.9991204493074519\n", "0.9999990158421206\n", "0.9999999988986638\n", "0.9999999999987665\n", "0.9999999999999986\n", "0.9999999999999999\n", "1.0\n", "0.9999999999999999\n", "1.0\n", "1.0\n" ] } ], "source": [ "for v in list1:\n", "    print(np.dot(v,e0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the iterates of $v_1$ are definitely *not* orthogonal to $e_0$.\n", "\n", "### The problem\n", "\n", "There are two issues, both related to small errors that propagate quickly and overtake our method. It turns out our plan to find $e_1$ relied on a method that is too delicate.\n", "\n", "The first issue is that $e_0$ is only an approximation of the principal eigenvector. Let's call the true principal eigenvector $E_0$ and our approximation $e_0$. Our $v_1$ is orthogonal to $e_0$, but not quite orthogonal to $E_0$, since $e_0$ isn't quite equal to $E_0$. Therefore, the iterates won't necessarily be orthogonal to $E_0$. In fact, if, say, $\\langle v_1, E_0\\rangle = \\epsilon$, then for the $n$-th iterate\n", "\n", "$$\\langle A^n v_1, E_0\\rangle = \\langle v_1, A^n E_0\\rangle = \\lambda_0^n \\epsilon.$$\n", "\n", "In our case $\\lambda_0 = 24.0625...$, so $\\lambda_0^n$ is quite large!\n", "\n", "The second issue is that even if $e_0$ exactly equaled the principal eigenvector, any roundoff errors made during the computation of the iterates would produce vectors that aren't exactly orthogonal to $e_0$. The outcome is the same as in the first issue: a small error will propagate." ] }
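, { "cell_type": "markdown", "metadata": {}, "source": [ "We can watch the first issue happen. The following sketch (not part of the original plan) applies $A$ repeatedly to $v_1$ *without* normalizing and compares $\\langle A^n v_1, e_0\\rangle$ with the predicted $\\lambda_0^n\\,\\epsilon$; the two columns should grow at the same geometric rate, though roundoff (the second issue) can make them differ by a modest factor." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "eps = np.dot(v1, e0)   # the tiny initial overlap epsilon\n", "w = v1.copy()\n", "for n in range(8):\n", "    # actual overlap vs. the prediction lambda0**n * eps\n", "    print(n, np.dot(w, e0), lambda0**n * eps)\n", "    w = np.dot(A, w)" ] }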
, { "cell_type": "markdown", "metadata": {}, "source": [ "So, what can we do to address these issues?\n", "\n", "### Solution 1\n", "\n", "One thing we can do is force each iterate to be orthogonal to $e_0$. I'll program this as follows:\n", "\n", "The input will be a matrix $A$, an initial vector $x$, a number of iterations $n$, and an additional unit vector $e$. I'll modify the program so that each iterate stays orthogonal to $e$ by subtracting its projection onto $e$:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def powermethod2(A, x, n, e):\n", "    # Power method with deflation: after every step, subtract the\n", "    # projection onto the unit vector e to stay orthogonal to it.\n", "    y = x / np.linalg.norm(x)\n", "    y = y - np.dot(y,e)*e\n", "    iterates = [y]\n", "    for i in range(n):\n", "        y = np.dot(A, y)\n", "        y = y / np.linalg.norm(y)\n", "        y = y - np.dot(y,e)*e\n", "        iterates.append(y)\n", "    return iterates" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "list1 = powermethod2(A,v1,10,e0)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e1 = [-0.78009251 -0.23529347 -0.14780656 0.56057638]\n", "lambda1 = -0.7904876636474643\n" ] } ], "source": [ "e1 = list1[-1]\n", "print(\"e1 =\",e1)\n", "Ae1 = np.dot(A,e1)\n", "lambda1 = np.dot(Ae1,e1)\n", "print(\"lambda1 = \",lambda1)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-0.06844714, 0.11454629, -0.00614274, -0.04879094])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(A,e1)-lambda1*e1" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "list1 = powermethod2(A,v1,50,e0)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "e1 = [-0.72531369 -0.3184697 -0.14246074 0.59346612]\n", "lambda1 = -0.8054849155764616\n" ] } ], "source": [ "e1 = list1[-1]\n", "print(\"e1 =\",e1)\n", "Ae1 = np.dot(A,e1)\n", "lambda1 = np.dot(Ae1,e1)\n", "print(\"lambda1 = \",lambda1)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-3.39226286e-08, 4.69881009e-08, -3.53711606e-09, -1.70930746e-08])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(A,e1)-lambda1*e1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise\n", "Extend the `powermethod2` function to accept a list of orthonormal vectors $e = [e_0, \\ldots, e_k]$ and make sure that at each iteration $y$ is orthogonal to all of them. See if you can find the eigenvectors $e_2, e_3$ and eigenvalues $\\lambda_2, \\lambda_3$ of the matrix $A$. A possible starting point is sketched below."
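] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you get stuck, here is one possible shape for the extension — just a sketch, with the name `powermethod3` and the argument `es` chosen arbitrarily. Finding $e_2, e_3$ and $\\lambda_2, \\lambda_3$ is left to you:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def powermethod3(A, x, n, es):\n", "    # Like powermethod2, but deflates against a whole list es of\n", "    # orthonormal vectors instead of a single vector e.\n", "    def deflate(y):\n", "        for e in es:\n", "            y = y - np.dot(y, e)*e\n", "        return y\n", "    y = deflate(x / np.linalg.norm(x))\n", "    iterates = [y]\n", "    for i in range(n):\n", "        y = deflate(np.dot(A, y))\n", "        y = y / np.linalg.norm(y)\n", "        iterates.append(y)\n", "    return iterates"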
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 2 }