Below are some mistakes (or inaccuracies) in the textbook Differential Equations with Boundary-Value Problems, 8th edition, by Zill and Wright, that I consider to need correcting, along with other items that I think require some elaboration. The abbreviation “Z&W” stands for this book. Throughout, “DE” means “ODE” (ordinary differential equation).
No disrespect is intended towards authors Zill and Wright. Writing a textbook of this length is an enormous undertaking, involving countless details, and it is virtually impossible to complete such a project with no errors. Textbook-authors also make pedagogical decisions geared towards the expected level of students who will be using the book. Sometimes a conscious decision may be made to simplify a definition at the expense of precision, to make the definition easier for the student to understand. Such decisions are bound not to please every instructor who teaches from the authors’ book. In addition, there are several things that I now call mistakes that I did not consider to be mistakes (or think about critically enough to discover problems with them) until I’d taught Elementary Differential Equations for many years.
Last updated Sun Apr 17 22:51 EDT 2016
- p. 6, paragraph “Explicit and Implicit Solutions”. This paragraph states that “You should be familiar with the terms explicit function and implicit function from your study of calculus.” This may be the case, but I hope it is not; if it is the case, then the misleading terminology “explicit function” has gained unfortunate and undeserved popularity in recent years. There is no difference between “explicit function”, as the term is used in this book, and function. The “explicitness” of a function is in the eye of the beholder.
The term “implicit function” is acceptable (and has been used for longer than I’ve been alive) but is not great. Better terms for this concept are “implicitly-defined function” and “implicitly-determined function”.
- p. 7, Example 6. It is inaccurate to refer to the one-parameter family \(y = cx - x\cos x\) as an explicit solution of the indicated DE. It is a one-parameter family of (explicit) solutions. The family contains infinitely many solutions, one for each \(c\). “An explicit solution” means one solution.
- p. 8. (Students: it is okay if you do not fully understand this comment. The main take-away for you is simply that you should not use the term “singular solution” the way Z&W does. It is probably best if you simply ignore all references to “singular solution” in the book.) I consider Z&W’s definition of singular solution to be poor. One reason is that before this definition appeared in Z&W (it may be in other recent textbooks as well, but it isn’t in the ones I checked), the term singular solution already had a pre-existing, much more restrictive, definition, which you can find here:
https://www.encyclopediaofmath.org//index.php?title=Singular_solution. There is another, less restrictive definition at the start of the Wikipedia article on “singular solution”, but the Z&W definition is significantly different from this weaker definition as well. Solutions that meet the Z&W definition of “singular solution” need not be singular solutions in either the original (Encyclopedia of Mathematics) sense or the broadened sense defined in the Wikipedia article; the Z&W definition is an oversimplification of these precise definitions. But there is an even more important reason that the Z&W definition is (in my opinion) poor: it suffers from an intrinsic problem. If Z&W used the term “blorp” instead of “singular solution”, the intrinsic problem would be exactly the same. Regardless of what term is used for the object called a “singular solution” in this paragraph of Z&W, this object is what mathematicians call not well-defined; the property “a solution fits as a member of a 1-parameter family” can be in the eye of the beholder (at least if the parameters are real numbers, as they are in Z&W). For many DEs, the set of all solutions can be written in more than one way. Using Z&W’s definition, a solution can be singular according to one valid way of writing the set of all solutions, but non-singular according to another valid way of writing the set of all solutions. The original definition (the one on the Encyclopedia of Mathematics page cited above) has no such defect; it allows one and only one correct answer to the question “Is this function a singular solution of this DE?” The same is true of the Wikipedia definition. The problem introduced on p. 8 is greatly exacerbated on pp. 48–49, in the paragraph “Losing a solution” and in Example 3, both of which reinforce a false statement.
A constant solution \( y=r\) missed by separating variables in the equation \( \frac{dy}{dx}=g(x)h(y)\) is never a (correctly defined) singular solution if \(h\) is continuously differentiable at \(r\). Even were we to allow the Z&W (re)definition of “singular solution,” Example 3 redone more carefully highlights the intrinsic problem with the Z&W definition: it can lead to self-contradictory conclusions. In working this example, the authors draw the conclusion that the solution \( y=-2\) is singular but that \( y=2\) is not. This conclusion is enabled by the failure to observe that in the set of solutions found by separation of variables, the constant \(c\) in equation (6) is not an arbitrary real number; it is an arbitrary nonzero real number. This set of solutions is exactly the same as the set given by equation (6) with \(c\) replaced by \(\frac{1}{c}\). With this replacement, the formula in equation (6) can be written as \(y=2\frac{c+e^{4x}}{c-e^{4x}}\). Writing this set of solutions this way makes it appear that the constant solution \( y=-2\) is part of a 1-parameter family (setting \(c=0\)), but that \( y=2\) is not. By the logic used in the book, \( y=2\) is therefore a singular solution while \( y=-2\) is not–exactly the opposite of the conclusion drawn in the book by the very same logic. This illustrates what I said above: the Z&W definition of “singular solution” is not a consistent definition of anything. If we get around the fact that “singular solution” already has an established meaning different from Z&W’s by using the word “blorp” for what Z&W call a singular solution, then we find that for some DEs there are solutions that are simultaneously blorps and not blorps.
With the correct definition of “singular solution”, neither \( y=2\) nor \( y=-2\) is a singular solution of the DE in this example.
For another problem with (this) Example 3, see the item “p. 49, line 2” below.
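The two competing parametrizations can be checked numerically. In the sketch below (my own, not from the book) I assume that the DE in Example 3 is \(\frac{dy}{dx}=y^2-4\) and that equation (6) reads \(y=2\frac{1+ce^{4x}}{1-ce^{4x}}\); both assumptions are consistent with the constant solutions \(y=\pm 2\) and with the rewritten formula above, but check them against your copy of the book.

```python
import math

def f(y):
    # right-hand side of the DE dy/dx = y**2 - 4 (Example 3's DE, as I read it)
    return y * y - 4

def book_form(c):
    # equation (6): y = 2(1 + c e^{4x}) / (1 - c e^{4x}); c = 0 gives y = 2
    return lambda x: 2 * (1 + c * math.exp(4 * x)) / (1 - c * math.exp(4 * x))

def rewritten_form(c):
    # same family with c -> 1/c: y = 2(c + e^{4x}) / (c - e^{4x}); c = 0 gives y = -2
    return lambda x: 2 * (c + math.exp(4 * x)) / (c - math.exp(4 * x))

def solves_de(y, x, h=1e-6):
    dydx = (y(x + h) - y(x - h)) / (2 * h)   # central-difference derivative
    return abs(dydx - f(y(x))) < 1e-4

# Both parametrizations give solutions of the DE for nonzero c ...
assert solves_de(book_form(0.5), 1.0) and solves_de(rewritten_form(0.5), 1.0)
# ... and each "absorbs" a different constant solution at c = 0:
assert book_form(0)(3.7) == 2.0          # y = 2 looks "non-singular" here
assert rewritten_form(0)(3.7) == -2.0    # y = -2 looks "non-singular" here
```

Both families solve the DE for every nonzero \(c\); which constant solution appears at \(c=0\) depends purely on how the family is written, which is exactly the point made above.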
- p. 10, definition of “general solution” in the gray box. See “p. 61, instructions for problems 1-24, continued” below.
- p. 18, problem 30a. The statement that “\(y=\tan(x+c)\) is a one-parameter family of solutions of the DE \(y’=1+y^2\)” is inaccurate. For every value of \(c\), the equation “\(y=\tan(x+c)\)” defines infinitely many solutions of the given DE, each having a (maximal) domain-interval of width \(\pi.\) For example, the equation “\(y=\tan(x)\)” defines one solution on the interval \((-\pi/2, \pi/2),\) another on \((\pi/2, 3\pi/2)\), another on \((-101\pi/2, -99\pi/2)\), etc. A correct statement is that \(y=\tan(x+c)\) is an infinite collection of one-parameter families of solutions of \(y’=1+y^2\). This collection can be indexed by an integer \(n\). The \(n\)th family is the family of solutions \(y=\tan(x+c)\) on the domain-interval \(((n-\frac{1}{2})\pi-c, (n+\frac{1}{2})\pi-c).\)
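The branch structure can be spot-checked numerically. This quick sketch (my own; the tolerance and sample points are arbitrary) verifies that \(y=\tan x\) satisfies \(y’=1+y^2\) at points taken from several different maximal branches, each of which is a separate solution:

```python
import math

def residual(x, h=1e-7):
    # |y'(x) - (1 + y(x)^2)| for y = tan, via a central difference
    dydx = (math.tan(x + h) - math.tan(x - h)) / (2 * h)
    return abs(dydx - (1 + math.tan(x) ** 2))

# One point from each of three different branches (n = 0, n = 1, n = -50):
for x in (0.3, 0.3 + math.pi, 0.3 - 50 * math.pi):
    assert residual(x) < 1e-3
```

No single interval contains all three sample points, so no single solution does either; the formula \(y=\tan(x+c)\) packages infinitely many solutions per value of \(c\).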
- p. 18, problem 31a. The statement that “\(y= -1/(x+c)\) is a one-parameter family of solutions of the DE \(y’=y^2\)” is inaccurate. For every value of \(c\), the equation “\( y= -1/(x+c)\)” defines two solutions of the given DE, one having (maximal) domain-interval \((-\infty,-c)\), the other having (maximal) domain-interval \((-c,\infty)\). A correct statement is that “\(y= -1/(x+c)\)” is a collection of two one-parameter families of solutions of \(y’=y^2\).
- p. 18, problem 32ab. Typo: Either the intended \(x\)-values or the intended \(y\)-values of the initial conditions in (a) and (b) were reversed. The initial condition in (a) should be either \(y(1)=-1\) or \(y(3)=1;\) the initial condition in (b) should be whichever of these initial conditions isn’t used in (a).
- p. 49, line 2. While the equation (6) can reasonably be said to represent a 1-parameter family of functions, it is incorrect to say that equation (6) represents a 1-parameter family of solutions of a DE. In a 1-parameter family of solutions, we would have exactly one solution for each value of the parameter. But for each \(c>0\), equation (6) defines two solutions, one on the interval \((-\infty, \frac{1}{4}\ln\frac{1}{c})\), and the other on \((\frac{1}{4}\ln\frac{1}{c}, \infty).\)
For another problem with (this) Example 3, see the item “p. 8” above.
- p. 55 and later: The notation “\(\int P(x)dx + c_1\)” for the general antiderivative of \(P(x)\) is misleading unless some comment is made. By definition, “\(\int P(x)dx\)” itself is the collection of all antiderivatives of \(P(x)\); the “\(+c_1\)” is redundant. The constant of integration is absorbed into the notation “\(\int P(x)dx\)”; if \(F(x)\) is any specific antiderivative of \(P(x)\) on an interval \(I\), then, on \(I\), \(\int P(x)dx =F(x) + C\), where \(C \) is an arbitrary constant. Note that Z&W does not make this mistake on p. 47 (equations (3) and (4)), on p. 48 (Example 2), or on p. 49 (Example 4).
However, we frequently find (especially when solving first-order linear differential equations) that we would like to refer to one antiderivative of a function, not the collection of all of them at once. Many authors and instructors take the simplest route in this context, writing “\(\int P(x)dx\)” when they mean one specific (even if arbitrary) antiderivative of a function. This is an example of something that mathematicians call “abuse of notation”, using notation in a not-quite-correct way in the interest of simplification. This particular abuse of notation is not such a terrible choice, if the authors and instructors explicitly tell students that the notation they’re using here is not the (correct) notation that students learned in Calculus 1; they are redefining the notation, in this context, for the sake of simplicity.
My own notation for “specific, arbitrary antiderivative of \(P(x)\)” is \(\int_{\rm spec} P(x)dx\), which is rather clunky but does not involve abuse of notation.
- p. 57, definition of “general solution”. See “p. 61, instructions for problems 1-24, continued” below.
- p. 61, instructions for problems 1-24. The second sentence of the instructions, “Give the largest interval \(I\) over which the general solution is defined” is analogous to saying “Tell me the class you’re taking” to someone who may be taking more than one class. For several of the DEs in problems 1-24, there is more than one largest interval over which a general solution is defined, so this instruction needs clarification. The authors may have meant either
- (a) “Give the largest interval(s) \(I\) over which general solution(s) is/are defined,” or
- (b) “Give a largest interval \(I\) over which a general solution is defined.”
The answers in the back of the book are correct if the instructions are modified as in (b). If the instructions are modified as in (a), then several answers in the back of the book become incorrect.
Note that without any such correction, even the first sentence of the instructions, “In Problems 1-24, find the general solution of the differential equation,” does not make sense for many of the problems. It would make sense if the definition of “general solution” were taken to be the one I prefer–the set of all solutions (not counting restrictions of solutions from a larger domain to a smaller one)–but this is not the way Z&W defines “general solution”.
- p. 61, instructions for problems 1-24, continued. Another problem with these instructions stems from the book’s definitions of “general solution” on p. 57 and p. 10:
- On p. 57, the book defines “general solution” only for first-order linear DEs that are in standard form (equation (2) on p. 54, not equation (1)). Several of the DEs in these exercises are not in standard form, so the definition on p. 57 doesn’t apply. (Some of the DEs are in “differential form”, with no specification of dependent variable, making matters worse insofar as the definition of “general solution” is concerned. And the word “linear” should not even be applied to DEs in differential form. While I want students to be able to pass between a DE in differential form and its derivative-form relatives, I do not want them to think that these types of equations are the same thing–even though I don’t expect students in this course to fully understand what type of animal a DE in differential form actually is.) The authors do not make clear whether, by “general solution” of a DE of the form in equation (1) on p. 54, they mean “general solution of the DE you get after dividing through by \( a_1(x)\).” This appears to be what the authors mean, but it is not consistent with the definition of “general solution” on p. 10 (a definition which has its own problems). A linear first-order DE is any DE of the form in equation (1) on p. 54 (modulo subtracting terms from both sides of the equation). Such an equation may have solutions, even whole families of solutions, on intervals that are larger than the largest intervals on which the functions \( P\) and \( f\) in equation (2) that you’d get from dividing equation (1) by \( a_1(x)\) are defined. For example, in problem 17, the book’s answer gives \( (-\pi/2,\pi/2)\) as “the” largest interval over which “the” general solution is defined. However, the given formula actually gives the general solution (in the sense of p. 10) on the entire interval \( (-\infty,\infty)\).
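The problem-17 claim can be spot-checked numerically. ASSUMPTION in the sketch below (my own; check your copy of the book): problem 17 is \((\cos x)\frac{dy}{dx}+(\sin x)\,y=1\), whose general solution the book gives as \(y=\sin x + C\cos x\).

```python
import math

def y(x, C=3.0):
    # the book's general-solution formula (C = 3 is an arbitrary choice)
    return math.sin(x) + C * math.cos(x)

def lhs(x, C=3.0, h=1e-7):
    # left-hand side cos(x) y' + sin(x) y, with y' by central difference
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return math.cos(x) * dydx + math.sin(x) * y(x, C)

# The formula satisfies the *undivided* equation far outside (-pi/2, pi/2):
for x in (0.0, 2.0, 10.0):
    assert abs(lhs(x) - 1.0) < 1e-5
```

The sample points \(2.0\) and \(10.0\) lie well outside \((-\pi/2,\pi/2)\), illustrating that the formula is a solution of the original (undivided) DE on all of \((-\infty,\infty)\), even at points where \(\cos x = 0\) and the standard-form coefficients are undefined.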
- In problem 15, the associated derivative-form equation with \( x\) as the dependent variable (equivalently, with \( y\) as the independent variable) is linear. For this derivative-form equation, there are two largest intervals on which the set of all solutions is a 1-parameter family. The book’s answer gives only the interval \( (0,\infty);\) the other is \( (-\infty,0) \). However, for this DE, there is a perfectly good set of solutions on every open interval, and the largest such interval is the whole real line, \( (-\infty,\infty)\). The set of solutions of this DE on \( (-\infty,\infty)\) (or on any open interval containing 0) is
the two-parameter family$$\left\{ x(y)= \left\{ \begin{array}{ll} 2y^6+c_1 y^4 &\mbox{if}\ y\geq 0, \\ 2y^6+c_2 y^4 &\mbox{if}\ y< 0,\end{array}\right. \ \ c_1, c_2\ \mbox{arbitrary constants}\ \ \ \right\}.$$
It is not clear whether Zill and Wright would call this "the general solution on \( (-\infty,\infty)\)", or would say that this DE does not have a general solution on \( (-\infty,\infty)\). (The definition on p. 10 of “general solution on an interval \(I\)”, applied to a first-order DE, disallows the term “general solution on \(I\)” if the set of all solutions is a family with more than one parameter.)
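The two-parameter family displayed above can be verified numerically. ASSUMPTION in the sketch below (my own; check your copy of the book): problem 15 is the DE \(y\,dx - 4(x + y^6)\,dy = 0\), i.e. \(y\,\frac{dx}{dy} = 4(x + y^6)\) in derivative form with \(y\) as the independent variable, which matches the family \(x(y)=2y^6+cy^4\).

```python
def x_of_y(y, c1=3.0, c2=-7.0):
    # the piecewise family above, with *independent* constants on each side of 0
    c = c1 if y >= 0 else c2
    return 2 * y ** 6 + c * y ** 4

def residual(y, h=1e-6):
    # |y x'(y) - 4(x + y^6)|, with x'(y) by central difference
    dxdy = (x_of_y(y + h) - x_of_y(y - h)) / (2 * h)
    return abs(y * dxdy - 4 * (x_of_y(y) + y ** 6))

# The DE holds on both sides of y = 0 and across it (both terms vanish to
# fourth order at y = 0, so the piecewise function is differentiable there):
for y in (1.5, -1.5, 0.0):
    assert residual(y) < 1e-3
```

Since \(c_1\) and \(c_2\) can be chosen independently, the set of solutions on any open interval containing 0 is genuinely a two-parameter family, as claimed above.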
- p. 67, Example 3. There is no real mistake in this example. It is actually a good example illustrating that there are first-order DEs that are neither separable, linear, nor exact, but that students will still be expected to be able to solve by combining their (expected) algebra skills with the methods for these special categories of DE. The method in Example 3 produces a differential-form DE that is nearly equivalent to the original derivative-form DE, and from which all the solutions of the derivative-form DE can be recovered. But the wording of Z&W’s solution-method could easily mislead a student about what “exact DE” means. (See “p. 69, instructions for exercises 1-20” below.) The original DE is not exact, and “clearing denominators and putting everything on one side of the equation” is a technique that is sometimes helpful and sometimes useless. It is a good tool for students to have in their arsenal, but even when we are able to produce an exact equation by such tools, that doesn’t mean that the equation we started with was exact.
Regarding the “nearly equivalent” above: the differential-form DE written in the first line of the “SOLUTION” has solutions \( x(y)\equiv 1\) and \( x(y)\equiv -1\), which are not solutions of the original DE. However, no “\(x\)-equals-constant-function-of-\(y\)” relation can be a solution of the original derivative-form DE, since the original DE (1) had \(x\) as the independent variable, and (2) was defined only on regions of the \(xy\) plane in which \(x\neq\pm 1\) and \(y\neq 0\). The subset of solutions of the differential-form equation in which \(x\) is the independent variable happens to be exactly the set of all solutions of the original derivative-form DE. This is why there is no actual mistake in this example, but there are other examples in which algebraic manipulations of the type done here can introduce spurious solutions or lose true solutions.
- p. 69, instructions for exercises 1-20. These instructions overlook the fact that the definition of “exact equation” is (and should be) extremely limited. A DE is not exact unless (1) it is in differential form, and (2) it has an exact differential on one side of the equation, and zero on the other. This is the totality of DEs that should be called exact. An equation that is merely equivalent to an exact equation should not be called exact. Every DE in differential form, on a region in which the differential is nowhere zero, is locally equivalent to an exact DE (though finding an integrating factor may be a practical impossibility).
The book’s Definition 2.4.1 on p. 64 is correct (except that on the next-to-last line, it should say “exact equation on R“, and the last sentence should have the words “in R” or “on R” inserted at the end). The only other equations that are exact are those of the form “0= (exact differential)” instead of “(exact differential) = 0”. It is good for students to get practice with non-exact DEs that can “easily be turned into” exact ones, and then solved. (Here I’m intentionally being vague about what “easily” and “turning one equation into another” mean; a precise statement would require a long digression not suitable for this page.) Several of the DEs in 1-20 are of this type, and so is the DE in Example 3, p. 67 (but there’s a trick in this example that’s not represented in exercises 1-20). Not every first-order DE students are expected to solve will be separable, linear, or exact. However, do not be misled by the discussion in Example 3. The original DE is not exact.
Students: you may ignore most of this paragraph; stop reading when you come to a word whose mathematical meaning you don’t know. This paragraph is really addressed to instructors. The terminology “exact DE”, though completely established in the literature long before I was born, is intrinsically flawed. In most areas of mathematics, we regard two equations as being “essentially the same” if they are equivalent, i.e. if they have the same set of solutions. Thus we generally apply the same adjectives to equivalent equations. This fails for “exact DE”, since almost every DE of the form \( P(x,y)dx+Q(x,y)dy=0\) that we call “not exact” is equivalent (at least locally) to an exact equation. (Every nonvanishing differential has an integrating factor, at least locally, and often globally.) Most of the differential-form DEs that Z&W (and probably other textbooks) incorrectly label as exact are equivalent to exact equations in only a very restricted sense: the operations generating the equivalence-relation are addition and subtraction of differentials. (Multiplication by nonzero constants can also be allowed, but does not make any difference as far as the exact/non-exact labeling goes.) Restricting to these operations would make sense if we were introducing the space of continuous 1-forms on \( {\bf R}^2\) as an example of an (infinite-dimensional) vector space, and explicitly not wanting to mention that this space is also a module over the algebra of continuous functions on \( {\bf R}^2\). But in a course advanced enough to cover differential forms (in the sense of 1-forms, 2-forms, etc., not in the sense of “DE in differential form”), we would make use of the fact that the space of continuous (or smooth) p-forms is an algebra over the space of continuous (or smooth) real-valued functions; there would be no motivation for considering only the vector-space operations. 
So restricting to these operations in a course at the level of Elementary Differential Equations cannot be justified by saying, “When dealing with 1-forms on \( {\bf R}^2\), we view the space of 1-forms only as a vector space, with no other algebraic structure;” there is never any motivation for doing this. Thus, the only reason I see for treating two DEs in differential form as inequivalent (for the “exactness” labeling) if one can’t be obtained from the other by addition/subtraction of differentials and/or multiplication by nonzero constants, is that we expect all our students to be able to solve a DE “\(P\,dx=Q\,dy\)” if \(P\,dx-Q\,dy\) is an exact differential, but generally don’t expect them to be able to solve “\(P\,dx=Q\,dy\)” if \(P\,dx-Q\,dy\) is not exact. In my view, “Students should be able to solve \(P\,dx-Q\,dy =0\) whenever \(P\,dx-Q\,dy\) is an exact differential” is not adequate justification for calling “\(P\,dx=Q\,dy\)” an exact equation whenever \(P\,dx-Q\,dy=0\) is an exact differential–especially since most other DEs that are merely equivalent to an exact DE are not exact. (For example, if “\(P\,dx+Q\,dy=0\)” is exact, then “\(e^xP\,dx+e^xQ\,dy=0\)” is never exact unless \( Q\equiv 0\).)
- p. 69, problems 1-20. WARNING FOR ANYONE USING THIS BOOK’S ONLINE MATERIALS: Last year, when I examined the online materials for this section of the book, I found exercises of the form “Determine whether this equation is exact” for which the answers were wrong. Specifically, these were DEs of the form
\(P(x,y)dx=Q(x,y)dy\) in which \( P(x,y), Q(x,y)\) were such that the DE \(P(x,y)dx-Q(x,y)dy=0\) was exact. The correct answer, “not exact” (automatically true because the DE has a nonzero expression on each side of the equation), was counted as wrong by the software. I communicated indirectly with one of the authors through a Cengage representative, and was told that this issue will be fixed in the next edition.
- p. 70, problem 44. This is a “Discussion Problem”, so of course there literally can be no mistake. But anyone assigning or discussing this problem should be aware of the following:
- The equation \(\frac{dy}{dx} = g(x)h(y)\) is automatically not exact, since it is not in differential form.
- The equations \(dy = g(x)h(y)dx\) and \(\frac{1}{h(y)}dy = g(x)dx\) are also automatically not exact (unless \(g(x)\equiv 0\)), since each of these equations has a nonzero differential on each side of the equation.
- The equation \(\frac{1}{h(y)}dy -g(x)dx=0 \) is exact. However, it is not equivalent to the original derivative-form DE unless \(h\) is a function that is nowhere zero. As with all separable equations, all constant solutions \( y\equiv r\), where \( h(r)=0\), are lost when we divide by \( h(y)\).
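To make the bookkeeping in these bullets concrete, here is a small numeric sketch (my own, with the arbitrary choices \(g(x)=\cos x\), \(h(y)=y^2+1\)) of the component test \(\partial M/\partial y = \partial N/\partial x\) for equations put in the form \(M\,dx + N\,dy = 0\). Note that it only samples the criterion at a few points; it is an illustration, not a proof.

```python
import math

def is_exact(M, N, pts, h=1e-5, tol=1e-4):
    # Checks dM/dy == dN/dx (by central differences) at each sample point.
    for (x, y) in pts:
        My = (M(x, y + h) - M(x, y - h)) / (2 * h)
        Nx = (N(x + h, y) - N(x - h, y)) / (2 * h)
        if abs(My - Nx) > tol:
            return False
    return True

pts = [(0.5, 0.3), (1.0, -2.0), (-3.0, 4.0)]

# (1/h(y)) dy - g(x) dx = 0, i.e. M = -g(x), N = 1/h(y): exact
# (M_y = 0 = N_x, since M depends only on x and N only on y).
assert is_exact(lambda x, y: -math.cos(x), lambda x, y: 1 / (y * y + 1), pts)

# dy = g(x)h(y) dx, rearranged to g(x)h(y) dx - dy = 0: not exact in general
# (here M_y = 2y cos x while N_x = 0).
assert not is_exact(lambda x, y: math.cos(x) * (y * y + 1),
                    lambda x, y: -1.0, pts)
```

This is consistent with the bullets above: only \(\frac{1}{h(y)}dy - g(x)dx = 0\) passes the exactness test once everything is in “(differential) = 0” form, and even then the constant solutions with \(h(r)=0\) have already been lost in the division by \(h(y)\).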
- pp. 143-146 (part of Section 4.4). For the instances in which a “naive” guess for a particular solution does not work, there is a great deal of unfortunate wording on these pages–wording that is imprecise and/or potentially misleading. Examples of such wording, with the most troublesome words italicized, include:
- p. 143, line 4: “… \( Ae^x\) is already present in \(y_c\).” Problem: the implicit assumption that the notion “one function is present in another” is well-defined (i.e. can be given precise, unambiguous meaning).
- p. 143, sentence following the subsection-heading “Case I“: “No function in the assumed particular solution …”. Problem (essentially the same as the one above): the implicit assumption that the notion “one function is in another” is well-defined.
- p. 143, 2nd paragraph of “Case I“: “… no function in the assumed particular solution \(y_p\) is duplicated by a function in the complementary function \(y_c\).” Problem: because of the problem mentioned above with “in“, the notion of what constitutes “duplication” is not well-defined.
- p. 143, last sentence: “Note that there is no duplication between the terms in \(y_p\) and the terms in the complementary function \( y_c = e^{4x}(c_1\cos 3x + c_2\sin 3x).\)” Main problem: the word “terms”. There is no such thing as a term of a general function. “Term” of a general function is in the eye of the beholder. The same function can be written more than one way, and what you might call a “term” when the function is written one way might not be what you’d call a term when the same function is written another way. (This is precisely why, every year, new students reinvent the “terrible method for solving exact equations”.) For this reason, the notion of what constitutes “duplication” is again not well-defined.
In specific contexts, we can talk about terms of an expression–not of a function–because “expression” incorporates the way we have chosen to write something down.
- p. 144, lines 5-6: “… there is no duplication of terms between \(y_p\) and \(y_c=c_1\cos 2x + c_2\sin 2x.\)”
- p. 144, “Form Rule for Case I”: “The form of \( y_p \) is a linear combination of all linearly independent functions that are generated by repeated differentiation of \(g(x)\).” Problems: (i) There is no such thing as “all linearly independent functions” obtained by whatever method. This imprecision is related to the informal usage of the plural “linearly independent functions” instead of the precise, singular, “linearly independent set of functions.” (ii) The notion of functions “generated by” repeated differentiation of another function is ambiguous. It can be given precise meaning, but that meaning is not directly the same as the meaning intended in this Form Rule. The fact that it ends up being equivalent to what is intended in this Form Rule (after fixing the “all linearly independent functions” wording) can mislead students into misunderstanding what “generated by” actually means.
- p. 144, Case II: “A function in the assumed particular solution …”
- p. 145, “Multiplication Rule for Case II“: “If any \(y_{p_i}\) contains terms that duplicate terms in \(y_c\) …”. Problems: (i) the word “contains” (similar to problems mentioned above, with one function being “in” another) and the word “in”; (ii) the word “terms”; (iii) the word “duplicate”.
Most, if not all, of these instances would not be so bad in the classroom, where the instructor can point to expressions on the blackboard and explain what is meant in this context by “term”, by one function being “contained in” another, etc. The instructor can also put imprecise or context-dependent wording in quotation marks. However, using this terminology without taking appropriate care (especially in a textbook, where students will take the printed word to carry authority) can lead to misunderstanding by students, not just in the present course but in subsequent courses (and in the use of mathematics outside of courses).
Some examples illustrating potential problems caused by the book’s wording:
- Suppose the equation to solve is \({y’}’-y=e^x\). Remember that there is no such thing as the fundamental set of solutions of the associated homogeneous DE; there are infinitely many such FSS’s. Suppose that you choose to use the FSS \(\{\cosh x, \sinh x\}\) (using the informal convention of identifying functions by their outputs when the input variable—\(x\) in this case—has been named). You then write the “complementary” solution—the general solution of the associated homogeneous DE—as a linear combination of the functions in this FSS: \(y_c =c_1\cosh x + c_2\sinh x\). Suppose that you now guess \(y_p=A e^x\). You observe that your \(y_p\) is a solution of the associated homogeneous DE, so you know you’re not in Case I. You have made no mistakes whatsoever. You look at the “Multiplication Rule for Case II” on p. 144. Question: does the “\(e^x\)” in your initially-guessed \(y_p\) “duplicate a term in \(y_c\)”? Your \(e^x\) is neither \(\cosh x\) nor \(\sinh x\), so you certainly haven’t duplicated these. However, \(e^x =\cosh x + \sinh x\), another way of seeing that your \(y_p\) is one of the solutions that comprise the general solution of the associated homogeneous DE. But is \(\cosh x + \sinh x\) a term of \(y_c =c_1\cosh x + c_2\sinh x\)? No, not by any conventional definition of “term of an expression”. So, do you multiply your \(y_p\) by anything? The answer is yes, but you cannot logically deduce this fact from the book’s statement of “Multiplication Rule for Case II“.
- Look at the end of the last sentence on p. 143, which writes \(y_c=e^{4x}(c_1\cos 3x + c_2\sin 3x).\) Ask the following questions: (i) Is \(e^{4x}\) a term of \(y_c\)? (ii) Is \(\cos 3x\) a term of \(y_c\)? (iii) Is \(e^{4x}\cos 3x\) a term of \(y_c\)? To use the Method of Undetermined Coefficients correctly, you have to give the answer “no” to each of the first two questions, and “yes” to the third. But in the given expression for \(y_c\), you see the expressions \(e^{4x}\) and \(\cos 3x\) explicitly, while nowhere do you see \(e^{4x}\cos 3x\) explicitly. By usual conventions, \(e^{4x}\cos 3x\) is not a term of the expression \(e^{4x}(c_1\cos 3x + c_2\sin 3x).\) Had \(y_c\) been written equivalently as \(y_c=c_1 e^{4x}\cos 3x + c_2 e^{4x}\sin 3x,\) then \(c_1 e^{4x}\cos 3x\) would have been a term of this expression for \(y_c\). In this discussion, we have written the same set of functions two different ways (two different expressions). We can meaningfully identify \(c_1 e^{4x}\cos 3x\) as a term of this other expression for \(y_c\), and (with only minor inaccuracy) identify \(e^{4x}\cos 3x\), without the “\(c_1\)”, as a term of the same expression. But without taking a great deal more care than has been taken in this book, we cannot identify \(e^{4x}\cos 3x\) as a “term of \( y_c\)”, given the book’s expression for this \( y_c\).
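A numeric check of the \(\cosh/\sinh\) example above (my own sketch): for \({y’}’-y=e^x\), the corrected guess is \(y_p=\frac{x}{2}e^x\) (obtained by multiplying \(Ae^x\) by \(x\) and then solving for \(A\)), while any \(Ae^x\) solves the homogeneous equation and so cannot produce the forcing term.

```python
import math

def second_deriv(f, x, h=1e-4):
    # standard central second-difference approximation to f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

# Corrected particular solution y_p = (x/2) e^x: y_p'' - y_p = e^x.
yp = lambda x: 0.5 * x * math.exp(x)
for x in (-1.0, 0.0, 2.0):
    assert abs(second_deriv(yp, x) - yp(x) - math.exp(x)) < 1e-4

# Any A e^x (here A = 3) solves the homogeneous equation y'' - y = 0,
# so no choice of A in the naive guess can ever produce the e^x forcing.
yh = lambda x: 3.0 * math.exp(x)
for x in (-1.0, 0.0, 2.0):
    assert abs(second_deriv(yh, x) - yh(x)) < 1e-4
```

The check is independent of whether you write \(y_c\) using \(\{e^x, e^{-x}\}\) or \(\{\cosh x, \sinh x\}\): being a homogeneous solution is a property of the function, not of the expression chosen for \(y_c\).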
Some recommendations:
- Everywhere in this book-section where terminology like “one function is in another,” one function is a “term” of another, etc., put this terminology in quotation marks. Ask your instructor how to say with precise terminology what the book’s authors meant (if your instructor has not already done so).
- If your instructor has not already done so, ask him or her to reformulate “Form Rule for Case I” precisely. There are several ways of doing this.
- If your instructor has not already done so, ask him or her to reformulate the “Multiplication Rule for Case II” precisely. There are several ways of doing this, all of which involve multiplying certain functions by a non-negative power of the independent variable. The reformulation that’s probably closest to what’s in this book is to replace “where n is the smallest positive integer that eliminates that duplication” by “where n is the smallest positive integer such that \(x^n y_{p_i}\) is not a solution of the associated homogeneous DE.”
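For constant-coefficient DEs, the reformulated rule is algorithmic. The helper below is my own sketch, not from the book: for a forcing term \(e^{\alpha x}\), the smallest \(n\) such that \(x^n e^{\alpha x}\) is not a solution of the associated homogeneous DE equals the multiplicity of \(\alpha\) as a root of the characteristic polynomial.

```python
def multiplicity(coeffs, alpha, tol=1e-9):
    """Multiplicity of alpha as a root of a_0 + a_1 r + ... + a_k r^k,
    found by evaluating successive derivatives of the polynomial."""
    m = 0
    while coeffs:
        if abs(sum(c * alpha ** i for i, c in enumerate(coeffs))) > tol:
            return m
        m += 1
        # differentiate: d/dr sum c_i r^i = sum i c_i r^{i-1}
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return m

# y'' - y = e^x: characteristic polynomial r^2 - 1; alpha = 1 is a simple
# root, so multiply the guess A e^x by x^1.
assert multiplicity([-1, 0, 1], 1.0) == 1
# y'' - 2y' + y = e^x: (r - 1)^2; alpha = 1 is a double root: use x^2.
assert multiplicity([1, -2, 1], 1.0) == 2
# y'' + y = e^x: alpha = 1 is not a root: no multiplication needed (n = 0).
assert multiplicity([1, 0, 1], 1.0) == 0
```

Phrasing the rule this way removes any dependence on how \(y_c\) happens to be written, avoiding the \(\cosh/\sinh\) ambiguity discussed above.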
Note to instructors: If \(X\) is a nonempty set and \(F\) is a (vector) subspace of the space of all real-valued functions on \(X\), and we are given a specified basis of \(F\), say \(\{f_i\}_{i\in I}\) (\(I\) some index-set), then of course we can define what we mean by “term” of a function in this space. Every element \(f\) of \(F\) is uniquely a linear combination \(\sum_i c_i f_i\), where \(c_i=0\) for all but finitely many \(i\). For each \(i\in I\), \(c_i(f)f_i\) (coefficient included) is a term of \( f\). The space \(F_{\rm MUC}\) of functions \({\bf R}\to{\bf R}\) (or \( J\to {\bf R}\), where \(J\subset {\bf R} \) is any positive-length interval) to which the Method of Undetermined Coefficients is applicable can easily be given a basis, for example \(\{{\bf c}_{n,\alpha,\beta}: n\in {\bf Z}, \alpha,\beta\in {\bf R}, n\geq 0, \beta\geq 0\} \cup \{{\bf s}_{n,\alpha,\beta}: n\in {\bf Z}, \alpha,\beta\in {\bf R}, n\geq 0, \beta> 0\}\), where \({\bf c}_{n,\alpha,\beta}(x)=x^n e^{\alpha x} \cos(\beta x)\) and \({\bf s}_{n,\alpha,\beta}(x)=x^n e^{\alpha x} \sin(\beta x)\). The trouble is that Z&W does not specify which basis of \(F_{\rm MUC}\) to use, and does not give the usual table showing which \(y_p\)’s to associate with which \(g\)’s. Instead, a substitute for this table is provided through wording that cannot help but be ambiguous because no basis of the relevant function-space has been specified. Implicitly, Z&W seems to be assuming that the basis of \(F_{\rm MUC}\) above is unique (up to multiplying each basis-element by a nonzero constant), or “automatic”, despite having all but pointed out in problem 11 on p. 128 that the solution-space of \({y’}’-y=0\) does not have a unique basis.
- p. 148, exercise 33. Answer in back of book is correct if you assume \(\omega\neq 0\); otherwise answer in back of book makes no sense. Presumably, authors meant student to assume \(\omega\neq 0\). (Note to instructors: Of course, the answer in the back of the book does have the correct pointwise limit as \(\omega\to 0\), since, under mild hypotheses that are abundantly satisfied here, solutions of nth-order IVPs on fixed compact intervals \(I\) are continuous with respect to parameters [in the \(C^n(I)\) topology or anything weaker, e.g. the uniform topology].)
- p. 148, exercise 34. It is not clear whether the authors intended students to assume \(\gamma\neq\omega,\) or wanted students to realize that the cases \(\gamma\neq\omega\) and \(\gamma=\omega\) must be considered separately.
- p. 161, Remark (ii). The Remark says, “[D]o not hesitate to simplify the form of \(y_p\).” However, in this book and all others I’ve seen, no attempt is made to define “form of a particular solution” for solutions produced by any method other than the Method of Undetermined Coefficients. (There is good reason for this restriction. A particular solution is any single solution. The set of functions that can arise as solutions of DEs is essentially the set of all differentiable functions. This is such a large collection of functions that no useful meaning can be attached to the phrase “form of a solution” that could apply to all functions in this collection.) Based on the example given in this remark, what the authors seem to mean is, “[D]o not hesitate to simplify your formula for the general solution \(y_c+y_p\).” This is a valid instruction, but a better instruction is the more emphatic, “Always simplify your formula for the general solution as much as possible.”
This is a special case of a more general instruction that should be stressed in all classes at all levels (starting with fractions in elementary school), not just in college: “Always simplify your final answer to any math problem as much as possible.” Thus, to students who have been taught sufficiently well in the past, the special case above for Variation of Parameters is already known implicitly. Nonetheless, when teaching this topic I have always found that most students need to be reminded to simplify, so I agree with Zill and Wright that a remark along these lines should be made.
Of course, there may not be a unique simplest way to write an answer. (The words “of course” here are for the sake of instructors.) But there are formulas that are definitely not simplest, and these should always be simplified. For example, for some given DE, it may be true that “\(y=x-0+x\)” is a solution, but it is clear that the student writing his or her answer this way, instead of as “\( y=2x\)”, has not fully grasped something.
- p. 161, exercise 9. The answer in the back of the book is incomplete and misleading. This DE can be considered on two maximal intervals: \((0,\infty)\) and \((-\infty,0)\). The book’s answer is correct only on the interval \((0,\infty)\) (because of the “\( x_0>0\)” restriction given in the answer), and on this interval it is misleading, because the absolute-value symbols in “\( \ln|x|\)” are unnecessary.
Superfluous absolute-value symbols inside a logarithm can do actual harm to students, because they reinforce a common misunderstanding among students: that when log (or ln) of some quantity is obtained by integrating something, the expression inside the log needs absolute-value symbols, regardless of the expression or the point(s) at which it is being evaluated. Thus, students will write “\(\ln|2|\)” or “\(\ln|x^2+1|\)” because they are not aware that “\(\ln 2\)” or “\(\ln(x^2+1)\)” would be equally correct. The latter expressions are not just equally correct, but preferable, of course, because they are simpler and indicate a greater understanding of “ln” as a function, and that “ln|(whatever)|” is not simply a string of symbols that you’re supposed to use when doing certain integrals.
- p. 163, “Note”. The words “Hence” and “we” in the second sentence are misleading:
- Problem with “Hence”: The wish to guarantee that Theorem 4.1.1 applies is, by itself, justification only for focusing attention on finding general solutions on \((0,\infty)\) and on \((-\infty,0)\). Reasons for focusing first on the single interval \((0,\infty)\) are that (i) examining the solution-methods on one of the intervals first, rather than both at the same time, provides concreteness, (ii) the interval \((0,\infty)\) is the simpler interval to handle first, and (iii) Cauchy-Euler equations on \((-\infty,0)\) can be reduced to Cauchy-Euler equations on \((0,\infty)\) by a simple change of independent variable, \(z=-x\) (where \(x\) is the original independent variable).
- Problem with “we” (in “we focus”): It is ambiguous whether this “we” is intended to mean authors Zill and Wright, or to mean something more general like “mathematicians and their students”. It is not clear to me which meaning the authors had in mind, but the sentence becomes false if “we” is given a meaning more general than “authors Zill and Wright”. It is important that this Note not be interpreted as an instruction that the student, or anyone else, should focus on the interval \((0,\infty)\) whenever faced with a Cauchy-Euler equation, or that there is any consensus among mathematicians that the domain \((0,\infty)\) is assumed unless otherwise specified.
- p. 168, exercises 1–24. Either the instructions should say explicitly to solve these DEs on the interval \((0,\infty)\), or the answers in the back of the book should be corrected so as to be valid on both \((0,\infty)\) and \((-\infty,0)\). As mentioned above in the comment on the p. 163 “Note”, there is no consensus (or convention) among mathematicians that in a Cauchy-Euler equation, the domain \((0,\infty)\) is assumed unless otherwise specified. Even if the authors did not mean to imply such a convention, and intended for the Note on p. 163 to mean only that the remainder of Section 4.7, up until “Solutions for \( x < 0 \)”, will focus on the interval \(\{x > 0\} \), this would not eliminate the need to say in the exercises, “Consider only the interval \(\{x > 0\} \) unless otherwise instructed.”
- p. 238, instructions for Problems 23–24 and 25–30. The instructions for 23–24 should be changed to “… so that the summation index is the power of \(x\) in each term.” The instructions for 25–30 should be changed to “… in which the summation index is the power of \(x\) in each term.”
The phrase “[the] general term involves \(x^k\)” in the current instructions is an ambiguous way of saying what the authors intended: that the general term is a constant times \(x^k\). In, say, the power series \(\sum_{k=0}^\infty \frac{1}{k!}x^{k+2}\), it would not be incorrect to say that the general term involves \(x^k\), since \(x^{k+2}=x^2 x^k\).
The instructions for 23–30 could, of course, be made unambiguous by simply replacing “involves \(x^k\)” with “is a constant times \(x^k\).” But obviously it does not matter what letter is used for the summation index, so instructing students to use a particular letter is a questionable idea, even if this is the letter the authors chose to use in every relevant example in this section of the book.
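Whatever letter is used for the summation index, the reindexing can be sanity-checked by comparing partial sums; here is a sympy sketch for the example series \(\sum_{k=0}^\infty \frac{1}{k!}x^{k+2}\) mentioned above:

```python
import sympy as sp

x = sp.symbols('x')
N = 8  # number of terms to compare

# Original form: general term x^(k+2)/k! -- the power of x is k+2, not k.
original = sum(x**(k + 2) / sp.factorial(k) for k in range(N))

# Reindexed with j = k + 2, so the general term is a constant times x^j:
# sum over j >= 2 of x^j/(j-2)!.
reindexed = sum(x**j / sp.factorial(j - 2) for j in range(2, N + 2))

print(sp.expand(original - reindexed))  # 0: the two forms agree term by term
```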
- p. 239, Definition 6.2.1.
- Typo: in the first sentence, the words “of the differential” appear twice; one of these occurrences should be deleted.
- The definition of “singular point” is careless. For example, by this definition, for the differential equation \( y’ +\sqrt{x}\,y=0\), every point on the negative real axis is a singular point. (A function cannot be analytic at a point \(x_0\) if it is not even defined at \(x_0\).) Conventionally, for this DE, no negative number is even considered a candidate for the “ordinary point”/”singular point” classification. For example, calling -3 a singular point of this DE would be just plain silly. But Definition 6.2.1 would attach this label nonetheless, since -3 is not an ordinary point of the DE.
In fact, for this DE, even the point \(0\) would not conventionally be considered a candidate for this classification, because \(\sqrt{x}\) is defined on only one side of this point. The terms “ordinary point” and “singular point” are generally not used in the context of an arbitrary homogeneous linear differential equation \(a_n(x)\frac{d^n y}{dx^n} + a_{n-1}(x)\frac{d^{n-1}y}{dx^{n-1}} +\dots + a_0(x)y=0\) and point \(x_0\). For such an equation and point, neither the term “ordinary point” nor “singular point” is usually used unless the coefficient-functions \(a_j\) are of a certain type near \(x_0\), a type that (among other things) requires the functions to be defined at every point of some open interval \((x_0-\delta, x_0+\delta)\) except perhaps at \(x_0\) itself. (Here \(\delta>0\).)
Note to instructors: The “of a certain type” functions above are those that are real-axis restrictions of functions that are meromorphic in some open neighborhood of \(x_0\) in \( {\bf C}\).
- p. 239, Example 2(a). This is an unconventional, and potentially misleading, illustration of what “singular point of an ODE” means. See the comment on Definition 6.2.1 above. Since \(\ln x\) is not defined for any \(x<0\), “\(0\)” is simply not a candidate (conventionally) for being called either an ordinary point or a singular point of the ODE \( {y’}’ +xy’ +(\ln x)y=0.\)
Example 2(b) illustrates the conventional meaning of “singular point of a (linear, homogeneous) ODE”.
- p. 240, colored italicized sentence on lines 2–3. As stated, this sentence is false. What it should say is, “If \(a_2(x), a_1(x), a_0(x)\) are polynomials with no common factors, then \(x_0\) is an ordinary point of (1) if \(a_2(x_0)\neq 0\), and is a singular point of (1) if \(a_2(x_0)=0.\)”
It is clear from the paragraph at the bottom of p. 239 that the authors had these added hypotheses in mind for the sentence on p. 240. However, saying that “We will primarily be interested in the case when the coefficients \(a_2(x), a_1(x), a_0(x)\) are polynomials with no common factors” is a far cry from saying, “Until further notice, we assume that the coefficients \(a_2(x), a_1(x), a_0(x)\) are polynomials with no common factors.” In addition, the fact that this implicit hypothesis is made in un-emphasized text and on a different page from the one on which p. 240’s emphasized sentence (stated without any qualifiers) appears, increases the likelihood that students will not realize that this sentence has an invisible hypothesis. This is especially true if students are reviewing the material in the book, very reasonably taking emphasized statements at face-value instead of looking to see whether there are hypotheses buried elsewhere in the text.
Adding to this potential for misunderstanding is that when the “polynomials with no common factors” assumption is removed (Example 8, p. 245), the student receives no clue that had the DE not been presented in standard form, the non-polynomial nature of a coefficient could potentially have changed the analysis of whether \(0\) is an ordinary point. Had the equation in Example 8 instead been \(x{y’}’ + (\cos x -1)y=0\) or \(x^2{y’}’ + (\cos x -1)y=0\), the number \(0\) would still have been an ordinary point, despite meeting the singular-point criterion at the top of p. 240. But had the equation been \(x^3{y’}’ + (\cos x -1)y=0\), the number \(0\) would have been a singular point. Unfortunately, the only place Z&W touches on the relevant issue for non-polynomial coefficients is Discussion Problem 27 on p. 247.
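The three variants just mentioned can be checked with a computer algebra system. Since \(\cos x - 1 = -\frac{x^2}{2}+\frac{x^4}{24}-\cdots\), the standard-form coefficient \((\cos x - 1)/x^m\) extends analytically to \(0\) exactly when its limit at \(0\) is finite (for a quotient of analytic functions, a finite limit means a removable singularity). A sympy sketch:

```python
import sympy as sp

x = sp.symbols('x')
num = sp.cos(x) - 1

# Dividing x^m y'' + (cos x - 1) y = 0 by x^m puts the DE in standard form
# with coefficient (cos x - 1)/x^m on y.  A finite limit at 0 means the
# singularity there is removable, so 0 is an ordinary point.
results = {m: sp.limit(num / x**m, x, 0) for m in (1, 2, 3)}
print(results)  # {1: 0, 2: -1/2, 3: -oo} -> ordinary, ordinary, singular
```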
- p. 247, Discussion Problem 28. See “p. 239, Definition 6.2.1” above.
- p. 274, Definition 7.1.1. The notation on the left-hand side of equation (2) makes no sense, because “\(s\)” does not appear. Correct notation for the left-hand side of equation (2) is \({\mathcal L}\{f\}(s)\). (The notation \({\mathcal L}\{f(t)\}(s)\) would be poor, but not as bad as just \({\mathcal L}\{f(t)\}\).)
- p. 275, line 4 (and most, if not all, formulas involving \({\mathcal L}\{\mbox{something}\}\) in Chapter 7). The notation on line 4 is poor, for the two reasons mentioned in the comment on Definition 7.1.1:
- When we are writing the definition of some specific function, say \(f\), we write a formula like “\(f(x)=x^2\)”; we do not write the nonsensical “\(f=x^2\).” In each of the equations displayed in blue on line 4, since the letter \(s\) appears as the domain-variable of the right-hand side of each of these equations, \(s\) must appear on the left-hand side (in the appropriate place); otherwise the equations are mathematical gibberish.
- The expressions “\(f(t)\)”, “\(g(t)\)”, “\(y(t)\)” are not functions; the functions whose Laplace transforms are being taken are \(f, g\), and \(y\).
There are two correct ways of writing line 4. One is $$ {\mathcal L}\{f\}(s)=F(s), \ \ \ {\mathcal L}\{g\}(s)=G(s), \ \ \ {\mathcal L}\{y\}(s)=Y(s).$$ The other is $${\mathcal L}\{f\}=F, \ \ \ {\mathcal L}\{g\}=G, \ \ \ {\mathcal L}\{y\}=Y.$$ A domain-variable for the function on the right-hand side of any of these equations must appear on both sides of the equation, or on neither side. This is the variable we are choosing to call \(s\), but a function does not depend on the notation used for its domain-variable.
In working with specific functions in the context of DEs, it is frequently cumbersome or otherwise inconvenient to introduce a name for each function. In these circumstances, we may allow ourselves some “abuse of notation”. For example, for the DE \({y’}’-5y’+6y=0\), if \(t\) is the independent variable, we generally allow ourselves to say (incorrectly) “\(\{e^{2t}, e^{3t}\}\) is a fundamental set of solutions,” rather than the correct, but lengthy, “\(\{y_1, y_2\}\) is a fundamental set of solutions, where the functions \(y_1\) and \(y_2\) are defined by \(y_1(t)=e^{2t}\) and \(y_2(t)=e^{3t}\).” (A correct way to avoid this whole problem, without any abuse of notation, is to write “\(\{t\mapsto e^{2t}, t\mapsto e^{3t}\}\) is a fundamental set of solutions,” but unfortunately most students at the level of this course have not been taught the “\(\mapsto\)” notation for defining a function.) Similarly, we may allow ourselves some abuse of notation and write “\( {\mathcal L}\{e^{2t}\}(s) = \frac{1}{s-2}\).” But the reason for doing this is that we have a specific function, taking \(t\) to \(e^{2t}\), for which we don’t want to introduce a name. If we decide that we want to introduce a name, say \(f,\) for this function (so \(f(t)=e^{2t}\)), then we do not write “\( {\mathcal L}\{e^{2t}\}(s) = \frac{1}{s-2}\)” anymore; we write “\( {\mathcal L}\{f\}(s) = \frac{1}{s-2}\).” But whether or not we have introduced a name for the function just called \(f\), the formula we have written for the Laplace transform of this function has “\(s\)” in it as the argument of the new (transformed) function. Thus, it is still very poor to write “\( {\mathcal L}\{e^{2t}\} = \frac{1}{s-2}\);” that’s like writing “\(F=\frac{1}{s-2}\)” instead of “\(F(s)=\frac{1}{s-2}\).” (Poor as this notation is, it is unfortunately very common; Z&W is by no means the only textbook that does this.)
In the context of line 4 of p. 275, the circumstances justifying the abuse of notation in “\( {\mathcal L}\{e^{2t}\}\)” are not present. The functions being Laplace-transformed are non-specific functions that we have named: \(f, g\), and \(y\). The abuse of notation “\(f(t), g(t)\), and \(y(t)\)” is not justifiable on this line. At the level of this course, students may be forgiven for writing something like what’s on line 4, but they should absolutely not be taught to write this way.
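Incidentally, computer algebra systems are forced to keep both variables explicit. In sympy, for example, laplace_transform takes the original variable and the transform variable as separate arguments; a sketch:

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# The transform of the function t |-> e^(2t) is the function s |-> 1/(s-2);
# both variables have to be named for the computation even to be posed.
F, abscissa, cond = sp.laplace_transform(sp.exp(2*t), t, s)
print(F)         # 1/(s - 2)
print(abscissa)  # 2: the transform exists for s > 2
```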
- p. 276, “\({\mathcal L}\) is a Linear Transform”.
- The line between the first displayed equation and equation (3) is inadequate. This line should read as follows:
- “whenever both integrals converge. Hence if \(c\) is a real number such that both integrals converge for all \(s>c\), then for all \(s>c\) we have”
- The notation in equation (3) is invalid; see comment for p. 275. After making the correction above, equation (3) should be written as
$${\mathcal L}\{\alpha f + \beta g\}(s) = \alpha{\mathcal L}\{f\}(s) + \beta{\mathcal L}\{g\}(s) = \alpha F(s) + \beta G(s).$$
- The sentence after equation (3) is misleading, because it ignores the fact that (3) was derived only under the assumption that there exists a number \(c\) such that both integrals converge (equivalently, that both \(F(s)\) and \(G(s)\) exist) for all \(s>c\). This assumption can be justified, but the authors fail to say so. The justification lies in the following nontrivial fact (beyond the scope of this course): Every “Laplace-transformable” function \(f\)—a function for which \(F(s)\ (={\mathcal L}\{f\}(s) )\) exists for some \(s\)—has the feature that there is some real number \(c\) for which the domain of \(F\) contains the interval \((c,\infty)\). (Theorem 7.1.2 on p. 277, restated correctly below, guarantees this for functions \(f\) that are piecewise-continuous on \([0,\infty)\) and of exponential order. Nothing important in this chapter would be lost if these were the only functions for which Laplace transforms were discussed, but since the authors have elected to state the linearity property without any such restriction, in this “\({\mathcal L}\) is a Linear Transform” paragraph it is not valid to assume that there is any \(c\) such that \(F(s)\) exists for all \(s>c\).) The statement that such a \(c\) exists can be made without referring to any specific \(c\), as follows: \(F(s)\) exists for all \(s\) sufficiently large.
Note that if \(c<d\) and the domain of \(F\) includes \((c,\infty)\), then it also includes \((d,\infty)\); thus for each Laplace-transformable function \(f\) there is actually an infinite set of \(c\)’s for which the domain of \(F\) includes \((c,\infty)\). The \(c\)’s that “work” depend on the function \(f\). However, given two Laplace-transformable functions \(f\) and \(g\), there is at least one \(c_1\) such that \(F(s)\) is defined for all \(s>c_1\), and at least one \(c_2\) such that \(G(s)\) is defined for all \(s>c_2\). Taking \(c\) to be the larger of \(c_1\) and \(c_2\), both \(F(s)\) and \(G(s)\) are defined for all \(s>c\).
The existence of such a \(c\) is important, because it ensures that the functions \(F\) and \(G\) have a nonempty common domain: a set of \(s\)’s such that both \(F(s)\) and \(G(s)\) are defined. Two functions cannot be added together (which is something we’re doing in equation (3)) unless their domains have at least one element in common. For example, if one function has domain \( [2,\infty)\) and the other has domain \((-60,-50)\), there are no numbers simultaneously in both domains, so addition of the two functions becomes meaningless. Addition of two functions is defined only on their common domain. For our functions \(F\) and \(G\) above, the common domain includes the interval \((c,\infty)\) for some \(c\).
Knowing that such a \(c\) exists whenever both \(f\) and \(g\) are Laplace-transformable, we can state the most important outcome of the argument leading to equation (3) as follows:
- If \(f,g\) are Laplace-transformable and \(\alpha,\beta\) are real numbers, then \(\alpha f +\beta g \) is Laplace-transformable, and for all \(s\) sufficiently large we have
$${\mathcal L}\{\alpha f + \beta g\}(s) = \alpha F(s) + \beta G(s).$$
This is what is meant by the statement that “the Laplace transform is linear.”
Note to instructors: The content of the above is that \({\mathcal L}\) is linear as a map from the vector space of Laplace-transformable functions to the vector space of “real-valued germs at infinity”.
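The role of \(c\) can be made concrete with sympy, which reports an abscissa of convergence alongside each transform (the particular functions here are illustrative):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# F(s) = L{e^(2t)}(s) exists for s > 2; G(s) = L{e^(5t)}(s) exists for s > 5.
F, cF, _ = sp.laplace_transform(sp.exp(2*t), t, s)
G, cG, _ = sp.laplace_transform(sp.exp(5*t), t, s)

# The linear combination 3f + 4g is transformable, with c = max(cF, cG):
H, cH, _ = sp.laplace_transform(3*sp.exp(2*t) + 4*sp.exp(5*t), t, s)

print(cF, cG)  # 2 5, so the common domain of F and G contains (5, oo)
print(sp.simplify(H - (3*F + 4*G)))  # 0 on that common domain
```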
- p. 277, Definition 7.1.2. The definition of exponential order would be better stated in two parts, as follows:
- (i) Let \(c\) be a real number. A function \(f\) defined on \([0,\infty)\) is said to be of exponential order \(c\) if there exist constants \(M,T>0\) such that \(|f(t)|\leq Me^{ct}\) for all \(t\geq T\).
(ii) A function \(f\) defined on \([0,\infty)\) is said to be of exponential order if \(f\) is of exponential order \(c\) for some \(c\).
In part (i) of this definition, it is immaterial whether the inequality at the end is \(t\geq T\) or \(t> T\); either inequality leads to the same functions being called “of exponential order \(c\)”. It is also permissible to use the phrase “of exponential order \(\geq c\)” in part (i) of this definition (instead of “of exponential order \(c\)”), because for any \(t\geq 0\), if \(c_1<c_2\) and \(|f(t)|\leq Me^{c_1t}\) then \(|f(t)|\leq Me^{c_2t}\) as well.
The second sentence in this paragraph has the potential to stunt students’ growth, by indulging an unnecessary reliance on l’Hopital’s Rule. Over-reliance on l’Hopital’s Rule is a common habit of students that interferes with their understanding of common functions. The fact that for any real \(c>0\) and \(r\geq 0\), the function \(f(t)=\frac{t^r}{e^{ct}} =t^re^{-ct}\) is bounded, can be easily and more informatively deduced without using l’Hopital’s Rule. The Calculus-1 fact that a differentiable function \(f\) is increasing on any interval on which \(f’\) is positive, and decreasing on any interval on which \(f’\) is negative, is all that is needed. For (say) \(r=100\), we need only take one derivative, not 100, to determine that \(t^re^{-ct}\) is bounded, and the argument is no harder if \(r=99.137\).
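Written out, the one-derivative argument runs as follows (for real \(c>0\) and \(r>0\); the case \(r=0\) is immediate, since then \(f\) is decreasing). With \(f(t)=t^re^{-ct}\) for \(t\geq 0\), $$f'(t)=t^{r-1}e^{-ct}(r-ct),$$ which is positive for \(0<t<r/c\) and negative for \(t>r/c\). Hence \(f\) increases on \([0,r/c]\) and decreases on \([r/c,\infty)\), so \(0\leq f(t)\leq f(r/c)=(r/c)^re^{-r}\) for all \(t\geq 0\), and \(f\) is bounded.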
- p. 277, Theorem 7.1.2. Restated correctly: If \(f\) is piecewise continuous on \([0,\infty)\) and is of exponential order \(c\), then \({\mathcal L}\{f\}(s)\) exists for \(s>c\).
(See comment on p. 277, Definition 7.1.2.)
The constant function 1 is the Laplace transform of the distribution \(\delta_0\) (the “delta-function \(\delta(t)\)”). But despite its name, a delta-function is not a function on \([0,\infty)\) (or on any subset of the real line), even if we extend the notion of “function” to allow functions to take the values \(\infty\) and \(-\infty\). A delta-function is an example of something called a distribution or generalized function. Distributions are examples of (non-differential, linear) operators, “functions of functions”, not functions of numbers (or of points in \({\bf R}^2, {\bf R}^3\), etc.).
There are several ways of thinking of delta-functions, the most naive of which gives the appearance that a delta-function is a (type of) function taking values in the extended real numbers (the reals with \(\infty\) and \(-\infty\) thrown in)—but with a magical integration property that no extended-real-valued function can actually have. This “naive” view is actually very useful and intuitive, especially when first learning about delta-functions in physics. The value of this viewpoint should not be minimized, even though it turns out to be mathematically unsound. But a mathematics textbook should not give students the impression that a delta-function is just “another kind of function” on the real line, which it absolutely is not. A mathematics textbook should at least be true to the meaning of the word function. To view a delta-function as a function (of any kind), we have to make its domain a HUGE set of real-valued functions, not a set of numbers. The Z&W Remark may wrongly give the student the impression that the reason he or she might not be able to think of a function whose Laplace transform is 1 is that the formula for such a function is complicated. In actuality, the reason the student who’s carefully read Section 7.1 should not be able to think of a function whose Laplace transform is 1 is that there IS no function \(f\) to which Definition 7.1.1 applies for which \(F(s)=1\).
The motivation behind this Z&W Remark may be the fact that delta-functions are extremely important, and that they will arise and be used in later sections of this chapter. Their use later in this chapter is completely appropriate. But making this Remark in Section 7.1, when the only objects for which Laplace transform has been defined are true, real-valued functions on \([0,\infty)\), is premature, and the chosen wording is misleading.
- p. 281, definition of “the inverse Laplace transform”. If a given function \(F\) equals \({\mathcal L}\{f\}\) for some piecewise-continuous function \(f\), then \(F\) is the Laplace transform of infinitely many piecewise-continuous functions, at most one of which is continuous. If there is such a continuous function \(f\) (for a given \(F\)), we can reasonably choose to call \(f\) the inverse transform of \(F\), because the continuity-restriction singles out this \(f\) from among all the functions whose Laplace transform is \(F\). It is always incorrect to use the definite article “the” if the object being defined is not uniquely determined by information in the definition.
The authors attempt to deal with this problem in Remark (i) on p. 287, but the Remark comes too late and misses the point. It is true, and important, that for purposes of the forward Laplace transform, two piecewise-continuous functions \(f_1, f_2\) of exponential order are “essentially the same” (and will have the same Laplace transform) if they differ at only finitely many points in any bounded interval (i.e. if for any \(T>0\) the set \(\{t\in [0,T]: f_1(t)\neq f_2(t)\}\) has only finitely many elements). However:
- The place to make this point was back in Section 7.1. There, it would actually have been worthwhile to point out that for purposes of integration, hence for purposes of taking the (forward, not inverse) Laplace transform, functions \(f_1, f_2\) related to each other as above are “essentially the same”.
- In solving DEs, the only functions whose inverse Laplace transforms we take are functions \(Y\) that are the Laplace transforms of solutions \(y\) of differential equations. By definition, solutions of DEs are continuous functions. So we are never in the situation of wanting to define or compute the inverse Laplace transform of a function \(F\) for which there is no continuous \(f\) whose forward Laplace transform is \(F\).
In other words, the problem that Remark (i) attempts to address has been mis-identified. The fact that different piecewise-continuous functions \(f\) can have the same Laplace transform is a non-issue for solving DEs. The problem leading to Remark (i) is simply the poor definition of “inverse Laplace transform” on p. 281. There would be no problem to address in Remark (i) at all had “inverse Laplace transform” been defined correctly on p. 281.
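Consistent with this, when a computer algebra system computes an inverse Laplace transform it returns precisely the continuous representative. A sympy sketch (sympy attaches a Heaviside factor, which is identically 1 for \(t>0\)):

```python
import sympy as sp

t, s = sp.symbols('t s', positive=True)

# Among the infinitely many piecewise-continuous functions whose Laplace
# transform is 1/(s - 2), the continuous one is t |-> e^(2t); that is the
# representative returned here.
f = sp.inverse_laplace_transform(1/(s - 2), s, t)
print(sp.simplify(f))
```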